以太坊Geth RLP编码源码解析

2021-11-05 06:44:04

RLP编码

基本原理

RLP (Recursive Length Prefix) 编码唯一的目的是解决结构体编码问题。RLP编码设计的两个结构体分别为：String和List。它们的定义如下：

String指的是一串字节，对应Go语言中的 string/[]byte/uint/[]uint/[N]uint 等
List指定是由许多String和List组成的一个列表，如 struct/[]interface{} 等

比如，"\x11 \x22 \x33" 和 “abc” 为String类型，而 [“ab”, [“abc”,“a”], “\x11”] 则为List类型。

对于String类型，RLP编码的规则如下：

如果String的长度为 1 1 1，且它的值在 [ 0 x 00 , 0 x 7 f ] [0x00,0x7f] [0x00,0x7f] 之间，则其RLP编码就是字符本身，即前缀为空。前缀的值在 [ 0 x 00 , 0 x 7 f ] [0x00,0x7f] [0x00,0x7f] 之间。
否则，若String的长度小于 56 56 56，则其RLP编码是前缀拼接上字符本身，其前缀为 0 x 80 0x80 0x80 加上String长度。前缀的值在 [ 0 x 80 , 0 x b 7 ] [0x80,0xb7] [0x80,0xb7] 之间。
否则，若String的长度小于 2 64 2^{64} 264，则其RLP编码是前缀拼接上字符长度再拼接上字符本身，其前缀为 0 x b 7 0xb7 0xb7 加上String长度的二进制表示的字节长度。比如，一个String的长度为 1024 1024 1024，则其长度的二进制表示为 b 00000100 00000000 b00000100\quad00000000 b0000010000000000，其对应的十六进制表示为 0 x 0400 0x0400 0x0400，字节长度为 2 2 2。前缀的值在 [ 0 x b 8 , 0 x b f ] [0xb8,0xbf] [0xb8,0xbf] 之间。
否则，即String的长度大于等于 2 64 2^{64} 264，则其RLP编码为 ∅ \varnothing ∅。

从编码规则中可以知道，前缀的值和String的长度是有关的。另外需要注意，String长度为0的情况以及String长度为1但它的值不在 [ 0 x 00 , 0 x 7 f ] [0x00,0x7f] [0x00,0x7f] 中的情况是包含在第2条规则中的。

对于List类型，RLP编码的规则如下：

如果List中各项的RLP编码拼接起来长度小于 56 56 56，则其RLP编码是前缀拼接上各项的RLP编码，其前缀为 0 x c 0 0xc0 0xc0 加上各项的RLP编码拼接后的总长度。前缀的值在 [ 0 x c 0 , 0 x f 7 ] [0xc0,0xf7] [0xc0,0xf7] 之间。
否则，如果List中各项的RLP编码拼接起来长度小于 2 64 2^{64} 264，则其RLP编码是前缀拼接上各项编码拼接后的总长度再拼接上各项的RLP编码，其前缀为 0 x f 7 0xf7 0xf7 加上总长度的二进制表示的字节长度。前缀的值在 [ 0 x f 8 , 0 x f f ] [0xf8,0xff] [0xf8,0xff] 之间。

由于列表是任意嵌套的，因此列表的编码是递归的。

下图展示了RLP编码的过程。其中，黑色字段表示需要编码的数据，红色字段表示编码的结果。叶子节点表示一个String而非叶子节点表示List，并且非叶子节点代表的List中的各项由其孩子组成。由于编码的前缀字段反映了数据的类型以及长度信息，因此我们可以通过前缀字段对编码的数据进行解码。

此外，对于空字符串以及空列表，它们的编码分别为 “\x80” 和 “\xc0”。

具体实现

geth对于RLP编码的实现位于 go-ethereum/rlp 文件夹下，其中主要的代码文件包括 typecache.go, encode.go, decode.go。由于go语言中存在各种内置数据类型以及用户自定义类型，因此针对不同的数据类型，编码器和解码器的具体实现有所不同。

RLP编码模块的总体框架如下：

该模块的导出函数分别为 EncodeToBytes/Encode/EncodeToReader 和 Decode/DecodeBytes。根据需求不同，外界可以调用这个包中的不同的函数。根据编码或者解码的数据类型，导出函数利用typecache.go 中的 cachedTypeInfo 函数获取相应的编码器或者解码器。typecache.go 中会暂存相应类型的编码器或者解码器在 typeCache 变量中。在利用cachedTypeInfo 获取编码器或者解码器时，如果相应类型的编码器或者解码器还未被暂存，则会调用 encode.go 中的 makeWriter 和 decode.go 中的 makeDecoder 生成对应的编码器和解码器，并将其暂存下来。在下次需要用到这个编码器或解码器时，便可以直接从 typeCache 中获取。

typecache.go

typecache.go 中通过 typeCache 变量保存已经生成的编码器和解码器（针对不同的go语言数据类型，需要用到不同的编码器和解码器），并利用 typeCacheMutex 实现对 typeCache 的互斥访问。它们的定义如下：

var (
	typeCacheMutex sync.RWMutex
	typeCache      = make(map[typekey]*typeinfo)
)

type typekey struct {
	reflect.Type
	tags
}

type typeinfo struct {
	decoder    decoder
	decoderErr error // error from makeDecoder
	writer     writer
	writerErr  error // error from makeWriter
}

// tags represents struct tags.
type tags struct {
	// rlp:"nil" controls whether empty input results in a nil pointer.
	nilOK bool

	// This controls whether nil pointers are encoded/decoded as empty strings
	// or empty lists.
	nilKind Kind

	// rlp:"tail" controls whether this field swallows additional list
	// elements. It can only be set for the last field, which must be
	// of slice type.
	tail bool

	// rlp:"-" ignores fields.
	ignored bool
}

type decoder func(*Stream, reflect.Value) error

type writer func(reflect.Value, *encbuf) error

typekey 中保存了相应的类型信息，typeinfo 中保存了对应类型的编码器和解码器（函数），如果在创建编码器和解码器时出错，它会记录相应的错误信息。另外，对于结构体中的字段，它们的类型信息还包括相应的 tags。

如对于以下结构体：

type S struct{
	X uint
    Y uint `rlp:"nil"`
}

则在对结构体 S 编码时会忽略其中的的字段 Y。

typecache.go 中提供了 cachedTypeInfo 函数来获取对应类型的编码器和解码器，如果不存在，则通过 generate 函数生成。

func cachedTypeInfo(typ reflect.Type, tags tags) *typeinfo {
    typeCacheMutex.Lock()
	defer typeCacheMutex.Unlock()
	key := typekey{typ, tags}
	info := typeCache[key]
	if info != nil {
		return info
	}
	info = new(typeinfo)
	typeCache[key] = info
	info.generate(typ, tags)
	return info
}

另外，typecache.go 中还提供了 structFields 函数来解析结构体中的每一个字段（导出字段，首字母大写），并根据相应字段的类型及 tags 信息创建对应的编码器和解码器保存在 typeCache 中。

encode.go

encode.go 中提供了 Encode/EncodeToBytes/EncodeToReader 三个导出函数来实现RLP编码的不同输出形式。这些函数都会调用 encode 函数来进行编码，其中 encode 函数又会从 typeCache 中获取相应的编码器进行编码操作。

func (w *encbuf) encode(val interface{}) error {
	rval := reflect.ValueOf(val)
	writer, err := cachedWriter(rval.Type())
	if err != nil {
		return err
	}
	return writer(rval, w)
}

其中借助到了 encbuf 这一结构体，它的主要作用是暂存编码的中间结果，并通过它可以获取最终的编码结果。它的定义如下：

type encbuf struct {
	str      []byte        // string data, contains everything except list headers
	lheads   []listhead    // all list headers
	lhsize   int           // sum of sizes of all encoded list headers
	sizebuf  [9]byte       // auxiliary buffer for uint encoding
	bufvalue reflect.Value // used in writeByteArrayCopy
}

type listhead struct {
	offset int // index of this header in string data
	size   int // total size of encoded data (including list headers)
}

注意到 str 中存储的不是最终RLP编码的结果，因为它不包含List类型的头部，而List类型的头部则被存储在 lheads 中，最终的编码结果可以由这两项导出（通过 toBytes 和 toWriter 函数实现）。listhead 中存储了相应List类型头部在编码中的偏移 offset 及整个List类型的大小 size（注释中包含头部的含义是如果是嵌套List，则 size的大小包含了其中子List加上其头部的大小，但是不包含自身头部的大小）。

encode.go 中实现了针对go语言中各种不同类型的编码器的实现，cachedTypeInfo 函数中 generate 函数就会调用其中的 makeWriter 来生成相应的编码器。

// makeWriter creates a writer function for the given type.
func makeWriter(typ reflect.Type, ts tags) (writer, error) {
	kind := typ.Kind()
	switch {
	case typ == rawValueType:
		return writeRawValue, nil
	case typ.AssignableTo(reflect.PtrTo(bigInt)):
		return writeBigIntPtr, nil
	case typ.AssignableTo(bigInt):
		return writeBigIntNoPtr, nil
	case kind == reflect.Ptr:
		return makePtrWriter(typ, ts)
	case reflect.PtrTo(typ).Implements(encoderInterface):
		return makeEncoderWriter(typ), nil
	case isUint(kind):
		return writeUint, nil
	case kind == reflect.Bool:
		return writeBool, nil
	case kind == reflect.String:
		return writeString, nil
	case kind == reflect.Slice && isByte(typ.Elem()):
		return writeBytes, nil
	case kind == reflect.Array && isByte(typ.Elem()):
		return makeByteArrayWriter(typ), nil
	case kind == reflect.Slice || kind == reflect.Array:
		return makeSliceWriter(typ, ts)
	case kind == reflect.Struct:
		return makeStructWriter(typ)
	case kind == reflect.Interface:
		return writeInterface, nil
	default:
		return nil, fmt.Errorf("rlp: type %v is not RLP-serializable", typ)
	}
}

通过传入相应的类型及标志，makeWriter 函数会返回相应的编码器。

String类型

这里以 Uint 为例，注意所有go语言中的具体类型（包括用户自定义类型）都可以被抽象为String类型和List类型，其中Uint可以被抽象为String类型。

case isUint(kind):
	return writeUint, nil

func isUint(k reflect.Kind) bool {
	return k >= reflect.Uint && k <= reflect.Uintptr
}

如果类型属于 Uint，则对应的编码器为 writeUint 函数，其定义如下：

func writeUint(val reflect.Value, w *encbuf) error {
	w.encodeUint(val.Uint())
	return nil
}

func (w *encbuf) encodeUint(i uint64) {
	if i == 0 {
		w.str = append(w.str, 0x80)
	} else if i < 128 {
		// fits single byte
		w.str = append(w.str, byte(i))
	} else {
		s := putint(w.sizebuf[1:], i)
		w.sizebuf[0] = 0x80 + byte(s)
		w.str = append(w.str, w.sizebuf[:s+1]...)
	}
}

它的编码形式其实就是将其抽象成String类型按前面所述的规则进行编码。

List类型

这里以 Struct 为例，它可以被抽象为List类型：

case kind == reflect.Struct:
	return makeStructWriter(typ)

其中 makeStructWriter 函数的定义如下：

func makeStructWriter(typ reflect.Type) (writer, error) {
	fields, err := structFields(typ)
	if err != nil {
		return nil, err
	}
	for _, f := range fields {
		if f.info.writerErr != nil {
			return nil, structFieldError{typ, f.index, f.info.writerErr}
		}
	}
	writer := func(val reflect.Value, w *encbuf) error {
		lh := w.list()
		for _, f := range fields {
			if err := f.info.writer(val.Field(f.index), w); err != nil {
				return err
			}
		}
		w.listEnd(lh)
		return nil
	}
	return writer, nil
}

该函数返回一个该 Struct 对应的编码器，这个编码器会解析 Struct 中的各个字段，并对各个字段采用相应的编码方式进行编码。同时，注意到，由于 Struct 可以抽象为List类型，因此，在对其进行编码开始已经编码结束时会分别调用 list 以及 listEnd 函数以设置 encbuf 中 lhead s一项的 offset 以及 size 项。这里给出 list 函数及 listEnd 函数的定义：

// list adds a new list header to the header stack. It returns the index
// of the header. The caller must call listEnd with this index after encoding
// the content of the list.
func (w *encbuf) list() int {
	w.lheads = append(w.lheads, listhead{offset: len(w.str), size: w.lhsize})
	return len(w.lheads) - 1
}

func (w *encbuf) listEnd(index int) {
	lh := &w.lheads[index]
	lh.size = w.size() - lh.offset - lh.size
	if lh.size < 56 {
		w.lhsize++ // length encoded into kind tag
	} else {
		w.lhsize += 1 + intsize(uint64(lh.size))
	}
}

list 函数创建一个 listhead 结构体加入到 encbuf 的 lheads 中， listEnd 函数则会在List类型编码结束后设置对应List类型的大小 size字段。

其他各种go语言类型变量都可以被抽象为String类型或者List类型，具体编码方式在细节上有所不同，但总体上都是按照基本原理处所说的方式进行编码

其他细节

编码完成后的信息被存放在 encbuf 结构体中，但里面存储的不是最终的RLP编码，而是一个中间结果。我们可以通过调用 toBytes 函数将其转换成字节串的形式输出。至于为什么不直接存储最终的RLP编码结果，我想是因为这样可以让用户选择不同的输出形式，如 toBytes 或者 toWriter。其中 toBytes 函数的内容如下：

func (w *encbuf) toBytes() []byte {
	out := make([]byte, w.size())
	strpos := 0
	pos := 0
	for _, head := range w.lheads {
		// write string data before header
		n := copy(out[pos:], w.str[strpos:head.offset])
		pos += n
		strpos += n
		// write the header
		enc := head.encode(out[pos:])
		pos += len(enc)
	}
	// copy string data after the last list header
	copy(out[pos:], w.str[strpos:])
	return out
}

这个函数将最终的RLP编码结果写入到 out 中，并输出。因此需要不断从 w *encbuf 中向 out 拷贝数据，其中 pos 表示当前拷贝到 out 中的位置， strpos 表示当前拷贝在 w.str 中的位置，这段代码的最终结果会将 w.str 的内容在适当位置加上List类型头部并写入 out 中，返回。

decode.go

decode.go 中提供了 Decode/DecodeBytes 两个个导出函数来实现从不同输入源获取RLP编码并进行解码。这些函数都会调用 Decode 函数来进行解码，其中 encode 函数又会从 typeCache 中获取相应的解码器进行编码操作。

// Decode decodes a value and stores the result in the value pointed
// to by val. Please see the documentation for the Decode function
// to learn about the decoding rules.
func (s *Stream) Decode(val interface{}) error {
	if val == nil {
		return errDecodeIntoNil
	}
	rval := reflect.ValueOf(val)
	rtyp := rval.Type()
	if rtyp.Kind() != reflect.Ptr {
		return errNoPointer
	}
	if rval.IsNil() {
		return errDecodeIntoNil
	}
	decoder, err := cachedDecoder(rtyp.Elem())
	if err != nil {
		return err
	}

	err = decoder(s, rval.Elem())
	if decErr, ok := err.(*decodeError); ok && len(decErr.ctx) > 0 {
		// add decode target type to error so context has more meaning
		decErr.ctx = append(decErr.ctx, fmt.Sprint("(", rtyp.Elem(), ")"))
	}
	return err
}

其中要求 val 是一个指向go语言类型的指针，如果其不是指针或者指针为空，则返回相应的错误。否则，在 typeCache 中查询相应的解码器进行解码。

解码的过程中需要借助 Stream 这一结构体，它用来存储待解码的数据以及一些解码过程中的中间信息，其定义如下：

type Stream struct {
	r ByteReader

	// number of bytes remaining to be read from r.
	remaining uint64
	limited   bool

	// auxiliary buffer for integer decoding
	uintbuf []byte

	kind    Kind   // kind of value ahead
	size    uint64 // size of value ahead
	byteval byte   // value of single byte in type tag
	kinderr error  // error from last readKind
	stack   []listpos
}
// Kind 为 -1, Byte, String, List 其中之一

type listpos struct{ pos, size uint64 }
//pos为已经读取的数据在整个List类型中的位置，size为List类型原始数据的大小

其中，r 中存储了待解码的数据，需要不断从中读取数据并解码。每次进行解码，需要调用 Reset 函数初始化待解码的数据，在调用重置函数 Reset 时也会重置remaining, limited, uintbuf, kind, size, byteval, kinderr, stack 这些字段。如果是待解码的数据是字节串，则会设置 limited 为 true，同时设置 remaining 为其长度。uintbuf 用来暂存如字节串长度等信息，kind表示待解码的数据类型（初始化为-1），size表示原生的数据长度，如果 kind 类型为 Byte 则 byteval 为单字节数据的值，kinderr 为读取类型时遇到的错误，stack 主要用于处理嵌套列表的情况。

String类型

这里以 Uint 为例，观察其具体的解码器是怎么实现的

case isUint(kind):
	return decodeUint, nil
	
func decodeUint(s *Stream, val reflect.Value) error {
	typ := val.Type()
	num, err := s.uint(typ.Bits())
	if err != nil {
		return wrapStreamError(err, val.Type())
	}
	val.SetUint(num)
	return nil
}

func (s *Stream) uint(maxbits int) (uint64, error) {
	kind, size, err := s.Kind()
	if err != nil {
		return 0, err
	}
	switch kind {
	case Byte:
		if s.byteval == 0 {
			return 0, ErrCanonInt
		}
		s.kind = -1 // rearm Kind 重置类型
		return uint64(s.byteval), nil
	case String:
		if size > uint64(maxbits/8) {
			return 0, errUintOverflow
		}
		v, err := s.readUint(byte(size))
		switch {
		case err == ErrCanonSize:
			// Adjust error because we're not reading a size right now.
			return 0, ErrCanonInt
		case err != nil:
			return 0, err
		case size > 0 && v < 128:
			return 0, ErrCanonSize
		default:
			return v, nil
		}
	default:
		return 0, ErrExpectedString
	}
}

decodeUint 函数获取要解码数据类型的比特数，传入 Stream 结构体的成员函数 uint。 uint 函数首先调用 Kind 函数获取接下来要解码数据的类型 kind （Byte/String/List）以及原始数据的大小 size（所占字节数），如果调用 Kind 函数中出错，则会返回相应的错误信息到 err 中。如果没有出错，那么根据 Kind 函数获取的待解码数据类型，如果待解码数据为单字节 Byte，那么其原生数据和编码后的数据是相同的，Byte类型的数据值在调用 Kind 函数时会被存放在 Stream 的 byteval 中，直接返回它的值就可以了；如果待解码数据是 String，那么通过获取的原始数据大小 size 去读取接下来的数据即可，同时还会有相应的错误检测过程，如果无误，则返回结果。

List类型

这里以 Struct 为例，观察其具体的解码器是怎么实现的

case kind == reflect.Struct:
	return makeStructDecoder(typ)

func makeStructDecoder(typ reflect.Type) (decoder, error) {
	fields, err := structFields(typ)
	if err != nil {
		return nil, err
	}
	for _, f := range fields {
		if f.info.decoderErr != nil {
			return nil, structFieldError{typ, f.index, f.info.decoderErr}
		}
	}
	dec := func(s *Stream, val reflect.Value) (err error) {
		if _, err := s.List(); err != nil {
			return wrapStreamError(err, typ)
		}
		for _, f := range fields {
			err := f.info.decoder(s, val.Field(f.index))
			if err == EOL {
				return &decodeError{msg: "too few elements", typ: typ}
			} else if err != nil {
				return addErrorContext(err, "."+typ.Field(f.index).Name)
			}
		}
		return wrapStreamError(s.ListEnd(), typ)
	}
	return dec, nil
}

这里的实现形式其实与 encode.go 中的类似，只是将编码器改成了解码器。主要需要观察其中的 s.List 和 s.ListEnd 函数是怎么修改 s.stack 从而创建List类型信息的。它们的实现代码如下：

// List starts decoding an RLP list. If the input does not contain a
// list, the returned error will be ErrExpectedList. When the list's
// end has been reached, any Stream operation will return EOL.
func (s *Stream) List() (size uint64, err error) {
	kind, size, err := s.Kind()
	if err != nil {
		return 0, err
	}
	if kind != List {
		return 0, ErrExpectedList
	}
	s.stack = append(s.stack, listpos{0, size})
	s.kind = -1
	s.size = 0
	return size, nil
}

// ListEnd returns to the enclosing list.
// The input reader must be positioned at the end of a list.
func (s *Stream) ListEnd() error {
	if len(s.stack) == 0 {
		return errNotInList
	}
	tos := s.stack[len(s.stack)-1]
	if tos.pos != tos.size {
		return errNotAtEOL
	}
	s.stack = s.stack[:len(s.stack)-1] // pop
	if len(s.stack) > 0 {
		s.stack[len(s.stack)-1].pos += tos.size
	}
	s.kind = -1
	s.size = 0
	return nil
}

List 函数通过调用 s.Kind 来获取待解码数据的类型及大小，如果待解码的数据类型为List类型，则设置 s.stack，之后需要重置 s.kind 和 s.size。ListEnd 函数表示已经完成对一个 List 类型的解析，因此需要将其从 s.stack 中取出，在取出之前需要比较它的 pos 字段和 size 字段来确保已经读完整个List，最后如果存在嵌套列表，则还需要对其父List的 pos 加上相应的长度，表示这段已经解析过了。

其他细节

由于所有类型的编码器的实现都借助到了 Stream 结构体中的成员函数，因此需要查看一下其中一些重要的成员函数是怎么实现的。

对于 Kind 函数：

// Kind returns the kind and size of the next value in the
// input stream.
//
// The returned size is the number of bytes that make up the value.
// For kind == Byte, the size is zero because the value is
// contained in the type tag.
//
// The first call to Kind will read size information from the input
// reader and leave it positioned at the start of the actual bytes of
// the value. Subsequent calls to Kind (until the value is decoded)
// will not advance the input reader and return cached information.
func (s *Stream) Kind() (kind Kind, size uint64, err error) {
	var tos *listpos
	if len(s.stack) > 0 {
		tos = &s.stack[len(s.stack)-1]
	}
	if s.kind < 0 {
		s.kinderr = nil
		// Don't read further if we're at the end of the
		// innermost list.
		if tos != nil && tos.pos == tos.size {
			return 0, 0, EOL
		}
		s.kind, s.size, s.kinderr = s.readKind()
		if s.kinderr == nil {
			if tos == nil {
				// At toplevel, check that the value is smaller
				// than the remaining input length.
				if s.limited && s.size > s.remaining {
					s.kinderr = ErrValueTooLarge
				}
			} else {
				// Inside a list, check that the value doesn't overflow the list.
				if s.size > tos.size-tos.pos {
					s.kinderr = ErrElemTooLarge
				}
			}
		}
	}
	// Note: this might return a sticky error generated
	// by an earlier call to readKind.
	return s.kind, s.size, s.kinderr
}

Kind 函数返回输入流中下一个需要解码的数据的类型及大小，同时在获取数据大小由于需要从输入流中读取数据，因此会更新输入流中读指针的位置。首先获取s.stack 中的最后一个元素，如果当前解码的类型为一个列表，那么获取的内容肯定不为空（利用这个可以检测需要解码的数据是否已经超出了整个列表的长度）。

当已经成功处理完一个待解码的数据（此时s.kind==-1）,那么会通过调用 s.readKind 函数获取下一个要处理的数据类型，大小等信息，另外还需要检查这个大小信息是否已经超出了长度限制；否则，直接返回当前待处理的数据类型，大小等信息。s.readKind 函数的定义如下：

func (s *Stream) readKind() (kind Kind, size uint64, err error) {
	b, err := s.readByte()
	if err != nil {
		if len(s.stack) == 0 {
			// At toplevel, Adjust the error to actual EOF. io.EOF is
			// used by callers to determine when to stop decoding.
			switch err {
			case io.ErrUnexpectedEOF:
				err = io.EOF
			case ErrValueTooLarge:
				err = io.EOF
			}
		}
		return 0, 0, err
	}
	s.byteval = 0
	switch {
	case b < 0x80:
		// For a single byte whose value is in the [0x00, 0x7F] range, that byte
		// is its own RLP encoding.
		s.byteval = b
		return Byte, 0, nil
	case b < 0xB8:
		// Otherwise, if a string is 0-55 bytes long,
		// the RLP encoding consists of a single byte with value 0x80 plus the
		// length of the string followed by the string. The range of the first
		// byte is thus [0x80, 0xB7].
		return String, uint64(b - 0x80), nil
	case b < 0xC0:
		// If a string is more than 55 bytes long, the
		// RLP encoding consists of a single byte with value 0xB7 plus the length
		// of the length of the string in binary form, followed by the length of
		// the string, followed by the string. For example, a length-1024 string
		// would be encoded as 0xB90400 followed by the string. The range of
		// the first byte is thus [0xB8, 0xBF].
		size, err = s.readUint(b - 0xB7)
		if err == nil && size < 56 {
			err = ErrCanonSize
		}
		return String, size, err
	case b < 0xF8:
		// If the total payload of a list
		// (i.e. the combined length of all its items) is 0-55 bytes long, the
		// RLP encoding consists of a single byte with value 0xC0 plus the length
		// of the list followed by the concatenation of the RLP encodings of the
		// items. The range of the first byte is thus [0xC0, 0xF7].
		return List, uint64(b - 0xC0), nil
	default:
		// If the total payload of a list is more than 55 bytes long,
		// the RLP encoding consists of a single byte with value 0xF7
		// plus the length of the length of the payload in binary
		// form, followed by the length of the payload, followed by
		// the concatenation of the RLP encodings of the items. The
		// range of the first byte is thus [0xF8, 0xFF].
		size, err = s.readUint(b - 0xF7)
		if err == nil && size < 56 {
			err = ErrCanonSize
		}
		return List, size, err
}

这个函数通过解析编码的头部信息，来返回被编码数据的原始类型（Byte/String/List）和大小信息。其中 s.readUint() 函数的作用是读取相应大小的字节数据。

reflect库说明

为了方便实现对不同go数据类型的编码解码，需要借助reflect库，其具体说明见这里

码农公寓

RLP编码

基本原理

具体实现

typecache.go

encode.go

String类型

List类型

其他细节

decode.go

String类型

List类型

其他细节

reflect库说明

相关文章