SentenceSplitter

SentenceSplitter = object

Defined in: packages/cloud/src/client/types.gen.ts:5664

Parse text with a preference for complete sentences.

In general, this class tries to keep sentences and paragraphs together. Compared with the original TokenTextSplitter, it is therefore less likely to leave hanging sentences or sentence fragments at the end of a node chunk.
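The idea of preferring complete sentences can be sketched as a greedy packer that starts a new chunk instead of cutting a sentence in half. This is a minimal illustration, not the library's implementation: `countWords` and `packSentences` are hypothetical helpers, token counts are approximated by whitespace-separated words, and the sentence detection is deliberately naive.

```typescript
// Illustrative only: approximates tokens with whitespace-separated words.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Hypothetical helper: packs whole sentences into chunks of at most
// `chunkSize` words. A single sentence longer than `chunkSize` is kept whole.
function packSentences(text: string, chunkSize: number): string[] {
  // Naive sentence detection: runs of text ending in ., ! or ?
  const sentences = (text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current: string[] = [];
  let used = 0;

  for (const sentence of sentences) {
    const size = countWords(sentence);
    // Close the current chunk rather than splitting the sentence.
    if (current.length > 0 && used + size > chunkSize) {
      chunks.push(current.join(" "));
      current = [];
      used = 0;
    }
    current.push(sentence);
    used += size;
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}

const chunks = packSentences(
  "The cat sat. The dog barked loudly. Birds sang.",
  6,
);
console.log(chunks);
```

With a budget of 6 words, the first sentence stands alone because adding the second would exceed the budget, while the second and third fit together.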

Properties

include_metadata?

optional include_metadata: boolean

Defined in: packages/cloud/src/client/types.gen.ts:5668

Whether or not to consider metadata when splitting.

include_prev_next_rel?

optional include_prev_next_rel: boolean

Defined in: packages/cloud/src/client/types.gen.ts:5672

Include prev/next node relationships.

callback_manager?

optional callback_manager: unknown

Defined in: packages/cloud/src/client/types.gen.ts:5673

id_func?

optional id_func: string | null

Defined in: packages/cloud/src/client/types.gen.ts:5677

Function to generate node IDs.

chunk_size?

optional chunk_size: number

Defined in: packages/cloud/src/client/types.gen.ts:5681

The token chunk size for each chunk.

chunk_overlap?

optional chunk_overlap: number

Defined in: packages/cloud/src/client/types.gen.ts:5685

The token overlap of each chunk when splitting.
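To make the relationship between chunk_size and chunk_overlap concrete, the sketch below slides a fixed-size window over a token list so that consecutive windows share `chunkOverlap` tokens. It is a plain sliding window under stated assumptions, not the splitter's actual algorithm: `slidingWindows` is a hypothetical helper, and the requirement that the overlap be smaller than the chunk size is an assumption of this sketch.

```typescript
// Illustrative only: fixed-size windows that share `chunkOverlap` tokens.
function slidingWindows<T>(
  tokens: T[],
  chunkSize: number,
  chunkOverlap: number,
): T[][] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  // Assumption of this sketch: overlap must be smaller than the chunk size,
  // otherwise the window would never advance.
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in [0, chunkSize)");
  }

  const step = chunkSize - chunkOverlap;
  const windows: T[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    windows.push(tokens.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the input.
    if (start + chunkSize >= tokens.length) break;
  }
  return windows;
}

const windows = slidingWindows(["a", "b", "c", "d", "e", "f", "g"], 4, 2);
console.log(windows);
```

Each window starts `chunkSize - chunkOverlap` tokens after the previous one, so a larger overlap yields more chunks for the same input.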

separator?

optional separator: string

Defined in: packages/cloud/src/client/types.gen.ts:5689

Default separator for splitting into words.

paragraph_separator?

optional paragraph_separator: string

Defined in: packages/cloud/src/client/types.gen.ts:5693

Separator between paragraphs.

secondary_chunking_regex?

optional secondary_chunking_regex: string | null

Defined in: packages/cloud/src/client/types.gen.ts:5697

Backup regex for splitting into sentences.
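The three splitting settings operate at different granularities. The sketch below applies each one to a sample string to show what kind of pieces it produces. The separator values and the regex are illustrative assumptions, not the documented defaults, and the order in which the splitter falls back from one to the next is not specified on this page.

```typescript
// Illustrative values only; none of these are confirmed defaults.
const paragraphSeparator = "\n\n";
const secondaryChunkingRegex = "[^,.;]+[,.;]?";
const separator = " ";

const text =
  "First point, with detail. Second point.\n\nNew paragraph here.";

// paragraph_separator: coarse split into paragraphs.
const paragraphs = text.split(paragraphSeparator);

// secondary_chunking_regex: split a paragraph into clause-like fragments.
const fragments = (
  paragraphs[0].match(new RegExp(secondaryChunkingRegex, "g")) ?? []
).map((s) => s.trim());

// separator: finest split, into words.
const words = fragments[0].split(separator);

console.log(paragraphs.length, fragments, words);
```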

class_name?

optional class_name: string

Defined in: packages/cloud/src/client/types.gen.ts:5698

SentenceSplitter
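Putting the properties together, a configuration object might look like the sketch below. `SentenceSplitterConfig` is a local type that mirrors the properties listed above so the example is self-contained; in real code the generated `SentenceSplitter` type would be used instead. All values are illustrative assumptions rather than documented defaults.

```typescript
// Local mirror of the documented properties, declared here only so the
// example compiles on its own. Every field is optional, as documented.
type SentenceSplitterConfig = {
  include_metadata?: boolean;
  include_prev_next_rel?: boolean;
  callback_manager?: unknown;
  id_func?: string | null;
  chunk_size?: number;
  chunk_overlap?: number;
  separator?: string;
  paragraph_separator?: string;
  secondary_chunking_regex?: string | null;
  class_name?: string;
};

// Illustrative values; not confirmed defaults.
const config: SentenceSplitterConfig = {
  include_metadata: true,
  include_prev_next_rel: true,
  chunk_size: 512,
  chunk_overlap: 64,
  separator: " ",
  paragraph_separator: "\n\n",
  secondary_chunking_regex: "[^,.;]+[,.;]?",
  class_name: "SentenceSplitter",
};

// The regex travels as a string, so it is worth checking that it compiles
// before sending the configuration anywhere.
if (config.secondary_chunking_regex) {
  new RegExp(config.secondary_chunking_regex);
}

console.log(JSON.stringify(config, null, 2));
```

Because secondary_chunking_regex and id_func are typed as `string | null`, they can be set to `null` explicitly as well as omitted.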
