xia_coder.coder.Coder

class xia_coder.coder.Coder(doc_class, data_encode: Optional[str] = None, data_format: Optional[str] = None)

Bases: object

Default data coder

Using display data:
  • gzip compressed

  • record format (list of dictionaries)

__init__(doc_class, data_encode: Optional[str] = None, data_format: Optional[str] = None)

Methods

__init__(doc_class[, data_encode, data_format])

append_content(doc, file_obj)

Append a document to the content

encode(doc_list, file_obj)

Encode the list of documents into bytes

io_to_record(file_obj)

A specialized JSON parser: each segment is a list of dictionaries

io_to_utf8(file_obj)

Read block by block until it is possible to be parsed as a valid

parse_content(file_obj)

Attributes

decoder

default_encode

default_format

supported_encodes

supported_formats

append_content(doc: Document, file_obj) → int

Append a document to the content

Returns

Appended data size
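
The idea of appending to gzip-compressed content and reporting the appended size can be sketched with the standard library alone. This is a minimal illustration and does not call xia_coder: `append_record` is a hypothetical helper, and writing each document as its own gzip member holding a one-element JSON list is an assumption, not a description of what `append_content` does internally.

```python
import gzip
import io
import json


def append_record(record: dict, file_obj) -> int:
    """Append one record as its own gzip member; return the bytes appended.

    Hypothetical helper for illustration only, not part of xia_coder.
    """
    # Assumption: each appended segment is a JSON list holding one dictionary.
    segment = json.dumps([record]).encode("utf-8")
    member = gzip.compress(segment)
    file_obj.seek(0, io.SEEK_END)  # always write at the end of the content
    file_obj.write(member)
    return len(member)


buffer = io.BytesIO()
first = append_record({"id": 1}, buffer)
second = append_record({"id": 2}, buffer)

# gzip.decompress accepts several concatenated members and joins their payloads.
raw_text = gzip.decompress(buffer.getvalue()).decode("utf-8")
```

Under these assumptions the decompressed text is a sequence of back-to-back JSON lists rather than one JSON document, which would explain why a dedicated record parser is useful.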

encode(doc_list: List[Document], file_obj) → int

Encode the list of documents into bytes

Parameters
  • doc_list – document list

  • file_obj – file-like object to write to

Returns

Encoded data size
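
To make the contract concrete (write encoded bytes to a file-like object, return their size), here is a rough standard-library sketch of the default combination named in the class description: gzip compression over a record-format payload. It does not use xia_coder; `encode_records` is a hypothetical stand-in, and plain dictionaries stand in for `Document` instances.

```python
import gzip
import io
import json
from typing import Dict, List


def encode_records(records: List[Dict], file_obj) -> int:
    """Write records as gzip-compressed JSON; return the number of bytes written.

    Hypothetical helper for illustration only, not part of xia_coder.
    """
    # Record format: one JSON list of dictionaries, encoded as UTF-8.
    text = json.dumps(records, ensure_ascii=False)
    payload = gzip.compress(text.encode("utf-8"))
    file_obj.write(payload)
    return len(payload)


buffer = io.BytesIO()
size = encode_records([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}], buffer)

# Round trip: decompress and parse to recover the original records.
decoded = json.loads(gzip.decompress(buffer.getvalue()).decode("utf-8"))
```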

io_to_record(file_obj)

A specialized JSON parser: each segment is a list of dictionaries
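
One way to parse content made of several back-to-back list-of-dictionary segments is `json.JSONDecoder.raw_decode`, which parses one JSON value and reports where it ended. The sketch below assumes the input is already decoded text and that segments are simply concatenated; `iter_records` is a hypothetical name, and the actual `io_to_record` implementation may work differently.

```python
import json
from typing import Dict, Iterator


def iter_records(text: str) -> Iterator[Dict]:
    """Yield every dictionary from a string of concatenated JSON lists.

    Hypothetical helper for illustration only, not part of xia_coder.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        if text[pos].isspace():
            pos += 1
            continue
        # Parse one segment and move to the index just past it.
        segment, pos = decoder.raw_decode(text, pos)
        yield from segment


records = list(iter_records('[{"id": 1}, {"id": 2}] [{"id": 3}]'))
```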

classmethod io_to_utf8(file_obj)

Read block by block until it is possible to be parsed as a valid

Parameters

file_obj – Python file-like object
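
The description above is truncated, but the method name suggests reading raw blocks until they decode as valid UTF-8. A fixed-size block can end in the middle of a multi-byte character, so a naive `bytes.decode` per block may fail. The sketch below shows one common way to handle that with an incremental decoder from the standard library. It is an assumption about the general technique, not the library's implementation; `iter_utf8_chunks` and its `block_size` parameter are hypothetical.

```python
import codecs
import io
from typing import Iterator


def iter_utf8_chunks(file_obj, block_size: int = 4) -> Iterator[str]:
    """Read fixed-size byte blocks and yield text as soon as it decodes.

    Hypothetical helper for illustration only, not part of xia_coder.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    while True:
        block = file_obj.read(block_size)
        if not block:
            # final=True raises UnicodeDecodeError if a character is left incomplete.
            tail = decoder.decode(b"", final=True)
            if tail:
                yield tail
            break
        # Incomplete trailing bytes are buffered until the next block arrives.
        text = decoder.decode(block)
        if text:
            yield text


source = io.BytesIO("héllo wörld ✓".encode("utf-8"))
result = "".join(iter_utf8_chunks(source, block_size=4))
```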