xia_diff_match_patch.diff.DiffMatchPatch

class xia_diff_match_patch.diff.DiffMatchPatch

Bases: object

Class containing the diff, match and patch methods.

Also contains the behaviour settings.

__init__(): Inits a diff_match_patch object with default settings. Redefine these in your program to override the defaults.

Methods

`__init__`()	Initializes a diff_match_patch object with default settings.
`diff_bisect`(text1, text2, deadline)	Find the 'middle snake' of a diff, split the problem in two, and return the recursively constructed diff.
`diff_bisectSplit`(text1, text2, x, y, deadline)	Given the location of the 'middle snake', split the diff in two parts and recurse.
`diff_charsToLines`(diffs, lineArray)	Rehydrate the text in a diff from a string of line hashes to real lines of text.
`diff_cleanupEfficiency`(diffs)	Reduce the number of edits by eliminating operationally trivial equalities.
`diff_cleanupMerge`(diffs)	Reorder and merge like edit sections.
`diff_cleanupSemantic`(diffs)	Reduce the number of edits by eliminating semantically trivial equalities.
`diff_cleanupSemanticLossless`(diffs)	Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary.
`diff_commonOverlap`(text1, text2)	Determine if the suffix of one string is the prefix of another.
`diff_commonPrefix`(text1, text2)	Determine the common prefix of two strings.
`diff_commonSuffix`(text1, text2)	Determine the common suffix of two strings.
`diff_compute`(text1, text2, checklines, deadline)	Find the differences between two texts. Assumes that the texts do not have any common prefix or suffix.
`diff_fromDelta`(text1, delta)	Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.
`diff_halfMatch`(text1, text2)	Do the two texts share a substring which is at least half the length of the longer text? This speedup can produce non-minimal diffs.
`diff_levenshtein`(diffs)	Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.
`diff_lineMode`(text1, text2, deadline)	Do a quick line-level diff on both strings, then rediff the parts for greater accuracy.
`diff_linesToChars`(text1, text2)	Split two texts into an array of strings.
`diff_main`(text1, text2[, checklines, deadline])	Find the differences between two texts. Simplifies the problem by stripping any common prefix or suffix off the texts before diffing.
`diff_prettyHtml`(diffs)	Convert a diff array into a pretty HTML report.
`diff_text1`(diffs)	Compute and return the source text (all equalities and deletions).
`diff_text2`(diffs)	Compute and return the destination text (all equalities and insertions).
`diff_toDelta`(diffs)	Crush the diff into an encoded string which describes the operations required to transform text1 into text2.
`diff_xIndex`(diffs, loc)	loc is a location in text1, compute and return the equivalent location in text2.
`match_alphabet`(pattern)	Initialise the alphabet for the Bitap algorithm.
`match_bitap`(text, pattern, loc)	Locate the best instance of 'pattern' in 'text' near 'loc' using the Bitap algorithm.
`match_main`(text, pattern, loc)	Locate the best instance of 'pattern' in 'text' near 'loc'.
`patch_addContext`(patch, text)	Increase the context until it is unique, but don't let the pattern expand beyond Match_MaxBits.
`patch_addPadding`(patches)	Add some padding on text start and end so that edges can match something.
`patch_apply`(patches, text)	Merge a set of patches onto the text.
`patch_deepCopy`(patches)	Given an array of patches, return another array that is identical.
`patch_fromText`(textline)	Parse a textual representation of patches and return a list of patch objects.
`patch_make`(a[, b, c])	Compute a list of patches to turn text1 into text2.
`patch_splitMax`(patches)	Look through the patches and break up any which are longer than the maximum limit of the match algorithm.
`patch_toText`(patches)	Take a list of patches and return a textual representation.

Attributes

`BLANKLINEEND`
`BLANKLINESTART`
`DIFF_DELETE`
`DIFF_EQUAL`
`DIFF_INSERT`

diff_bisect(text1, text2, deadline)

Find the ‘middle snake’ of a diff, split the problem in two, and return the recursively constructed diff. See Myers’ 1986 paper, “An O(ND) Difference Algorithm and Its Variations”.

Parameters

text1 – Old string to be diffed.
text2 – New string to be diffed.
deadline – Time at which to bail if not yet complete.

Returns

Array of diff tuples.

diff_bisectSplit(text1, text2, x, y, deadline)

Given the location of the ‘middle snake’, split the diff in two parts and recurse.

Parameters

text1 – Old string to be diffed.
text2 – New string to be diffed.
x – Index of split point in text1.
y – Index of split point in text2.
deadline – Time at which to bail if not yet complete.

Returns

Array of diff tuples.

diff_charsToLines(diffs, lineArray)

Rehydrate the text in a diff from a string of line hashes to real lines of text.

Parameters

diffs – Array of diff tuples.
lineArray – Array of unique strings.

diff_cleanupEfficiency(diffs)

Reduce the number of edits by eliminating operationally trivial equalities.

Parameters: diffs – Array of diff tuples.

diff_cleanupMerge(diffs)

Reorder and merge like edit sections. Merge equalities. Any edit section can move as long as it doesn’t cross an equality.

Parameters: diffs – Array of diff tuples.

diff_cleanupSemantic(diffs)

Reduce the number of edits by eliminating semantically trivial equalities.

Parameters: diffs – Array of diff tuples.

diff_cleanupSemanticLossless(diffs)

Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary, e.g. The c<ins>at c</ins>ame. -> The <ins>cat </ins>came.

Parameters: diffs – Array of diff tuples.

diff_commonOverlap(text1, text2)

Determine if the suffix of one string is the prefix of another.

Parameters

text1 – First string.
text2 – Second string.

Returns

The number of characters common to the end of the first string and the start of the second string.
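
The overlap described above can be illustrated with a small standalone sketch. It is not the library's implementation (which may use a faster search); `common_overlap` is a hypothetical helper name chosen for this example.

```python
def common_overlap(text1: str, text2: str) -> int:
    """Length of the longest suffix of text1 that is also a prefix of text2.

    Illustrative sketch only; the name and the brute-force strategy are
    assumptions, not the library's code.
    """
    max_len = min(len(text1), len(text2))
    # Try the longest candidate first and shrink until one matches.
    for length in range(max_len, 0, -1):
        if text1.endswith(text2[:length]):
            return length
    return 0


print(common_overlap("123456xxx", "xxxabcd"))  # 3
print(common_overlap("123456", "abcd"))        # 0
```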

diff_commonPrefix(text1, text2)

Determine the common prefix of two strings.

Parameters

text1 – First string.
text2 – Second string.

Returns

The number of characters common to the start of each string.

diff_commonSuffix(text1, text2)

Determine the common suffix of two strings.

Parameters

text1 – First string.
text2 – Second string.

Returns

The number of characters common to the end of each string.
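
What diff_commonPrefix and diff_commonSuffix return can be shown with a minimal standalone sketch. The helper names below are hypothetical, and the linear scan is an assumption made for clarity rather than a description of the library's internals.

```python
def common_prefix(text1: str, text2: str) -> int:
    """Count the characters shared at the start of both strings."""
    count = 0
    for a, b in zip(text1, text2):
        if a != b:
            break
        count += 1
    return count


def common_suffix(text1: str, text2: str) -> int:
    """Count the characters shared at the end of both strings."""
    # A common suffix is a common prefix of the reversed strings.
    return common_prefix(text1[::-1], text2[::-1])


print(common_prefix("1234abcdef", "1234xyz"))  # 4
print(common_suffix("abcdef1234", "xyz1234"))  # 4
```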

diff_compute(text1, text2, checklines, deadline)

Find the differences between two texts. Assumes that the texts do not have any common prefix or suffix.

Parameters

text1 – Old string to be diffed.
text2 – New string to be diffed.
checklines – Speedup flag. If false, then don’t run a line-level diff first to identify the changed areas. If true, then run a faster, slightly less optimal diff.
deadline – Time when the diff should be complete by.

Returns

Array of changes.

diff_fromDelta(text1, delta)

Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.

Parameters

text1 – Source string for the diff.
delta – Delta text.

Returns

Array of diff tuples.

Raises

ValueError – If invalid input.
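
The decoding step can be sketched independently of the library. The version below assumes the delta format described under diff_toDelta (tab-separated `=n`, `-n` and `+text` tokens with %xx escapes) and assumes the operation constants are -1, 0 and 1; `from_delta` is a hypothetical name for this illustration.

```python
import urllib.parse

# Assumed values for the operation constants (not stated on this page).
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1


def from_delta(text1: str, delta: str) -> list:
    """Rebuild a list of (op, text) tuples from text1 and a delta string."""
    diffs = []
    pointer = 0  # Cursor into text1.
    for token in delta.split("\t"):
        if not token:
            continue  # Blank tokens are harmless.
        op, param = token[0], token[1:]
        if op == "+":
            diffs.append((DIFF_INSERT, urllib.parse.unquote(param)))
        elif op in ("-", "="):
            try:
                n = int(param)
            except ValueError:
                raise ValueError("Invalid number in delta: " + param)
            if n < 0:
                raise ValueError("Negative number in delta: " + param)
            text = text1[pointer:pointer + n]
            pointer += n
            diffs.append((DIFF_EQUAL if op == "=" else DIFF_DELETE, text))
        else:
            raise ValueError("Invalid delta operation: " + op)
    if pointer != len(text1):
        # The delta must account for every character of text1.
        raise ValueError("Delta length does not match source text length")
    return diffs


print(from_delta("jumps", "=3\t-2\t+ing"))
# [(0, 'jum'), (-1, 'ps'), (1, 'ing')]
```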

diff_halfMatch(text1, text2)

Do the two texts share a substring which is at least half the length of the longer text? This speedup can produce non-minimal diffs.

Parameters

text1 – First string.
text2 – Second string.

Returns

Five-element array containing the prefix of text1, the suffix of text1, the prefix of text2, the suffix of text2 and the common middle, or None if there was no match.

diff_levenshtein(diffs)

Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.

Parameters: diffs – Array of diff tuples.
Returns: Number of changes.
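
One way to derive this number from a diff is sketched below: within each run of edits between equalities, a deletion paired with an insertion counts as a substitution, so the run contributes the larger of the two totals. This is a standalone illustration with assumed constant values (-1, 0, 1) and a hypothetical function name, not the library's code.

```python
# Assumed values for the operation constants (not stated on this page).
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1


def levenshtein(diffs) -> int:
    """Count inserted, deleted or substituted characters in a diff."""
    total = 0
    insertions = 0
    deletions = 0
    for op, text in diffs:
        if op == DIFF_INSERT:
            insertions += len(text)
        elif op == DIFF_DELETE:
            deletions += len(text)
        else:
            # An equality closes the current run of edits.
            total += max(insertions, deletions)
            insertions = 0
            deletions = 0
    return total + max(insertions, deletions)


# Adjacent delete + insert overlap as substitutions.
print(levenshtein([(DIFF_DELETE, "abc"), (DIFF_INSERT, "1234"), (DIFF_EQUAL, "xyz")]))  # 4
# Separated by an equality, they are counted independently.
print(levenshtein([(DIFF_DELETE, "abc"), (DIFF_EQUAL, "xyz"), (DIFF_INSERT, "1234")]))  # 7
```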

diff_lineMode(text1, text2, deadline)

Do a quick line-level diff on both strings, then rediff the parts for greater accuracy. This speedup can produce non-minimal diffs.

Parameters

text1 – Old string to be diffed.
text2 – New string to be diffed.
deadline – Time when the diff should be complete by.

Returns

Array of changes.

diff_linesToChars(text1, text2)

Split two texts into an array of strings. Reduce the texts to a string of hashes where each Unicode character represents one line.

Parameters

text1 – First string.
text2 – Second string.

Returns

Three element tuple, containing the encoded text1, the encoded text2 and the array of unique strings. The zeroth element of the array of unique strings is intentionally blank.
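
The encoding can be illustrated with a standalone sketch: each distinct line is assigned an index into the array of unique strings, and the text is replaced by one character per line whose code point is that index. The helper names are hypothetical, and splitting on "\n" only is an assumption.

```python
def lines_to_chars(text1: str, text2: str):
    """Encode two texts so that each character stands for one whole line."""
    line_array = [""]  # Index 0 is intentionally blank.
    line_index = {}    # Maps a line to its position in line_array.

    def encode(text: str) -> str:
        pieces = text.split("\n")
        # Keep the newline with every line except a trailing partial line.
        lines = [piece + "\n" for piece in pieces[:-1]]
        if pieces[-1]:
            lines.append(pieces[-1])
        chars = []
        for line in lines:
            if line not in line_index:
                line_array.append(line)
                line_index[line] = len(line_array) - 1
            chars.append(chr(line_index[line]))
        return "".join(chars)

    return encode(text1), encode(text2), line_array


def chars_to_text(chars: str, line_array) -> str:
    """Rehydrate an encoded string back into real lines of text."""
    return "".join(line_array[ord(c)] for c in chars)


chars1, chars2, lines = lines_to_chars("alpha\nbeta\nalpha\n", "beta\nalpha\nbeta\n")
print(repr(chars1), repr(chars2), lines)
# '\x01\x02\x01' '\x02\x01\x02' ['', 'alpha\n', 'beta\n']
```

A character-level diff of the two encoded strings is then effectively a line-level diff, which diff_charsToLines converts back to real text.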

diff_main(text1, text2, checklines=True, deadline=None)

Find the differences between two texts. Simplifies the problem by stripping any common prefix or suffix off the texts before diffing.

Parameters

text1 – Old string to be diffed.
text2 – New string to be diffed.
checklines – Optional speedup flag. If present and false, then don’t run a line-level diff first to identify the changed areas. Defaults to true, which does a faster, slightly less optimal diff.
deadline – Optional time when the diff should be complete by. Used internally for recursive calls. Users should set DiffTimeout instead.

Returns

Array of changes.
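
The simplification mentioned above, stripping the common prefix and suffix so that only the differing middle needs a real diff, can be sketched as follows. `strip_common_affixes` is a hypothetical helper written for this illustration, not a method of the class.

```python
def strip_common_affixes(text1: str, text2: str):
    """Return (prefix, middle1, middle2, suffix) for two texts."""
    # Length of the shared prefix.
    p = 0
    limit = min(len(text1), len(text2))
    while p < limit and text1[p] == text2[p]:
        p += 1
    prefix = text1[:p]
    text1, text2 = text1[p:], text2[p:]

    # Length of the shared suffix of what remains.
    s = 0
    limit = min(len(text1), len(text2))
    while s < limit and text1[-1 - s] == text2[-1 - s]:
        s += 1
    suffix = text1[len(text1) - s:]

    return prefix, text1[:len(text1) - s], text2[:len(text2) - s], suffix


print(strip_common_affixes("The cat sat", "The dog sat"))
# ('The ', 'cat', 'dog', ' sat')
```

Only the two middles ("cat" and "dog" here) would need to be diffed; the prefix and suffix can be reattached as equalities.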

diff_prettyHtml(diffs)

Convert a diff array into a pretty HTML report.

Parameters: diffs – Array of diff tuples.
Returns: HTML representation.

diff_text1(diffs)

Compute and return the source text (all equalities and deletions).

Parameters: diffs – Array of diff tuples.
Returns: Source text.

diff_text2(diffs)

Compute and return the destination text (all equalities and insertions).

Parameters: diffs – Array of diff tuples.
Returns: Destination text.
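
How diff_text1 and diff_text2 relate to a diff can be shown with a standalone sketch: the source text is everything except insertions, and the destination text is everything except deletions. The constant values (-1, 0, 1) and function names are assumptions made for this example.

```python
# Assumed values for the operation constants (not stated on this page).
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1


def source_text(diffs) -> str:
    """Concatenate all equalities and deletions."""
    return "".join(text for op, text in diffs if op != DIFF_INSERT)


def destination_text(diffs) -> str:
    """Concatenate all equalities and insertions."""
    return "".join(text for op, text in diffs if op != DIFF_DELETE)


diffs = [
    (DIFF_EQUAL, "jump"), (DIFF_DELETE, "s"), (DIFF_INSERT, "ed"),
    (DIFF_EQUAL, " over "), (DIFF_DELETE, "the"), (DIFF_INSERT, "a"),
    (DIFF_EQUAL, " lazy"),
]
print(source_text(diffs))       # jumps over the lazy
print(destination_text(diffs))  # jumped over a lazy
```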

diff_toDelta(diffs)

Crush the diff into an encoded string which describes the operations required to transform text1 into text2. E.g. =3 -2 +ing -> Keep 3 chars, delete 2 chars, insert ‘ing’. Operations are tab-separated. Inserted text is escaped using %xx notation.

Parameters: diffs – Array of diff tuples.
Returns: Delta text.
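
The delta format described above can be reproduced with a short standalone sketch. The constant values (-1, 0, 1) and the set of characters left unescaped are assumptions; `to_delta` is a hypothetical name for this illustration.

```python
import urllib.parse

# Assumed values for the operation constants (not stated on this page).
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1

# Characters left unescaped in inserted text; this set is an assumption.
SAFE_CHARS = "!~*'();/?:@&=+$,# "


def to_delta(diffs) -> str:
    """Encode a list of (op, text) tuples as a tab-separated delta string."""
    tokens = []
    for op, text in diffs:
        if op == DIFF_INSERT:
            # Inserted text is carried in full, with %xx escaping.
            tokens.append("+" + urllib.parse.quote(text, safe=SAFE_CHARS))
        elif op == DIFF_DELETE:
            # Deletions and equalities only need a character count.
            tokens.append("-%d" % len(text))
        elif op == DIFF_EQUAL:
            tokens.append("=%d" % len(text))
    return "\t".join(tokens)


print(repr(to_delta([(DIFF_EQUAL, "jum"), (DIFF_DELETE, "ps"), (DIFF_INSERT, "ing")])))
# '=3\t-2\t+ing'
```

Because deleted and unchanged text is stored only as a length, the delta is compact but can be decoded only together with the original text1.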

diff_xIndex(diffs, loc)

Given loc, a location in text1, compute and return the equivalent location in text2, e.g. “The cat” vs “The big cat”: 1->1, 5->9.

Parameters

diffs – Array of diff tuples.
loc – Location within text1.

Returns

Location within text2.
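
The translation can be sketched by walking the diff while tracking how many characters of each text have been consumed. This is a standalone illustration with assumed constant values (-1, 0, 1) and a hypothetical function name; a location that falls inside deleted text is mapped to the point where the deletion starts.

```python
# Assumed values for the operation constants (not stated on this page).
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1


def x_index(diffs, loc: int) -> int:
    """Map a location in text1 to the equivalent location in text2."""
    chars1 = chars2 = 0  # Characters consumed so far in text1 / text2.
    last1 = last2 = 0    # The same counters before the current diff.
    for op, text in diffs:
        if op != DIFF_INSERT:
            chars1 += len(text)
        if op != DIFF_DELETE:
            chars2 += len(text)
        if chars1 > loc:
            # loc falls inside this diff.
            if op == DIFF_DELETE:
                return last2  # The location was deleted.
            break
        last1, last2 = chars1, chars2
    # Add the remaining offset within the current equality.
    return last2 + (loc - last1)


diffs = [(DIFF_EQUAL, "The "), (DIFF_INSERT, "big "), (DIFF_EQUAL, "cat")]
print(x_index(diffs, 1))  # 1
print(x_index(diffs, 4))  # 8
```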

match_alphabet(pattern)

Initialise the alphabet for the Bitap algorithm.

Parameters: pattern – The text to encode.
Returns: Hash of character locations.
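
A Bitap alphabet is a map from each pattern character to a bitmask of the positions where it occurs. The standalone sketch below illustrates the idea; it assumes the first pattern character maps to the highest bit, and the function is written for this example rather than taken from the library.

```python
def match_alphabet(pattern: str) -> dict:
    """Build the character -> position bitmask table used by Bitap."""
    masks = {}
    for i, char in enumerate(pattern):
        # Position 0 maps to the highest bit, the last position to bit 0.
        masks[char] = masks.get(char, 0) | (1 << (len(pattern) - i - 1))
    return masks


print(match_alphabet("abc"))     # {'a': 4, 'b': 2, 'c': 1}
print(match_alphabet("abcaba"))  # {'a': 37, 'b': 18, 'c': 8}
```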

match_bitap(text, pattern, loc)

Locate the best instance of ‘pattern’ in ‘text’ near ‘loc’ using the Bitap algorithm.

Parameters

text – The text to search.
pattern – The pattern to search for.
loc – The location to search around.

Returns

Best match index or -1.

match_main(text, pattern, loc)

Locate the best instance of ‘pattern’ in ‘text’ near ‘loc’.

Parameters

text – The text to search.
pattern – The pattern to search for.
loc – The location to search around.

Returns

Best match index or -1.

patch_addContext(patch, text)

Increase the context until it is unique, but don’t let the pattern expand beyond Match_MaxBits.

Parameters

patch – The patch to grow.
text – Source text.

patch_addPadding(patches)

Add some padding on text start and end so that edges can match something. Intended to be called only from within patch_apply.

Parameters: patches – Array of Patch objects.
Returns: The padding string added to each side.

patch_apply(patches, text)

Merge a set of patches onto the text. Return a patched text, as well as a list of true/false values indicating which patches were applied.

Parameters

patches – Array of Patch objects.
text – Old text.

Returns

Two element Array, containing the new text and an array of boolean values.

patch_deepCopy(patches)

Given an array of patches, return another array that is identical.

Parameters: patches – Array of Patch objects.
Returns: Array of Patch objects.

patch_fromText(textline)

Parse a textual representation of patches and return a list of patch objects.

Parameters: textline – Text representation of patches.
Returns: Array of Patch objects.
Raises: ValueError – If invalid input.

patch_make(a, b=None, c=None)

Compute a list of patches to turn text1 into text2. Use diffs if provided, otherwise compute it ourselves. There are four ways to call this function, depending on what data is available to the caller:

Method 1: a = text1, b = text2
Method 2: a = diffs
Method 3 (optimal): a = text1, b = diffs
Method 4 (deprecated, use method 3): a = text1, b = text2, c = diffs

Parameters

a – text1 (methods 1,3,4) or Array of diff tuples for text1 to text2 (method 2).
b – text2 (methods 1,4) or Array of diff tuples for text1 to text2 (method 3) or None (method 2).
c – Array of diff tuples for text1 to text2 (method 4) or None (methods 1,2,3).

Returns

Array of Patch objects.
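
The four call forms differ only in the types of the arguments, so they can be told apart by inspection. The sketch below shows one way to do that; `classify_patch_make_call` is a hypothetical helper, and it assumes texts are `str` and diffs are `list` objects. It is not the library's dispatch code.

```python
def classify_patch_make_call(a, b=None, c=None) -> int:
    """Return which of the four documented call forms the arguments match."""
    if isinstance(a, str) and isinstance(b, str) and c is None:
        return 1  # (text1, text2): the diff must be computed.
    if isinstance(a, list) and b is None and c is None:
        return 2  # (diffs): text1 must be rebuilt from the diff.
    if isinstance(a, str) and isinstance(b, list) and c is None:
        return 3  # (text1, diffs): nothing to recompute.
    if isinstance(a, str) and isinstance(b, str) and isinstance(c, list):
        return 4  # (text1, text2, diffs): deprecated; text2 is redundant.
    raise ValueError("Unknown call format")


example_diffs = [(0, "The "), (1, "big "), (0, "cat")]
print(classify_patch_make_call("The cat", "The big cat"))  # 1
print(classify_patch_make_call("The cat", example_diffs))  # 3
```

Method 3 is described as optimal because both the source text and the diff are already available, so neither has to be derived.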

patch_splitMax(patches)

Look through the patches and break up any which are longer than the maximum limit of the match algorithm. Intended to be called only from within patch_apply.

Parameters: patches – Array of Patch objects.

patch_toText(patches)

Take a list of patches and return a textual representation.

Parameters: patches – Array of Patch objects.
Returns: Text representation of patches.