Package Biskit :: Module difflib_old

Module difflib_old

Older version of difflib, kept here because of compatibility problems.

Classes

SequenceMatcher
SequenceMatcher is a flexible class for comparing pairs of sequences of any type, so long as the sequence elements are hashable.

Differ
Differ is a class for comparing sequences of lines of text, and producing human-readable differences or deltas.

Test
Mock test

Functions

get_close_matches(word, possibilities, n=3, cutoff=0.6)
Use SequenceMatcher to return list of the best "good enough" matches.

_count_leading(line, ch)
Return number of `ch` characters at the start of `line`.

IS_LINE_JUNK(line, pat=re.compile(r"\s*#?\s*$").match)
Return 1 for ignorable line: iff `line` is blank or contains a single '#'.

IS_CHARACTER_JUNK(ch, ws=" \t")
Return 1 for ignorable character: iff `ch` is a space or tab.

ndiff(a, b, linejunk=IS_LINE_JUNK, charjunk=IS_CHARACTER_JUNK)
Compare `a` and `b` (lists of strings); return a `Differ`-style delta.

restore(delta, which)
Generate one of the two sequences that generated a delta.

_test()

Function Details

get_close_matches(word, possibilities, n=3, cutoff=0.6)

Use SequenceMatcher to return list of the best "good enough" matches.

word is a sequence for which close matches are desired (typically a string).

possibilities is a list of sequences against which to match word (typically a list of strings).

Optional arg n (default 3) is the maximum number of close matches to return. n must be > 0.

Optional arg cutoff (default 0.6) is a float in [0, 1]. Possibilities that don't score at least that similar to word are ignored.

The best (no more than n) matches among the possibilities are returned in a list, sorted by similarity score, most similar first.

>>> get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
['apple', 'ape']
>>> import keyword as _keyword
>>> get_close_matches("wheel", _keyword.kwlist)
['while']
>>> get_close_matches("apple", _keyword.kwlist)
[]
>>> get_close_matches("accept", _keyword.kwlist)
['except']
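
The effect of `n` and `cutoff` can be sketched as follows. This is only an illustration: it uses the standard-library `difflib` (Python 3 syntax) as a stand-in for this module, on the assumption that both rank candidates by `SequenceMatcher.ratio()`.

```python
# Sketch using the standard-library difflib as a stand-in for difflib_old
# (assumption: both score candidates with SequenceMatcher.ratio()).
from difflib import SequenceMatcher, get_close_matches

candidates = ["ape", "apple", "peach", "puppy"]

# Scores behind the ranking: ratio() is 2*M/T, where M is the number of
# matched elements and T is the combined length of both sequences.
score_apple = SequenceMatcher(None, "apple", "appel").ratio()
score_ape = SequenceMatcher(None, "ape", "appel").ratio()

# n caps the number of results, so only the best match is kept.
best_only = get_close_matches("appel", candidates, n=1)

# A stricter cutoff drops every candidate that scores below it.
strict = get_close_matches("appel", candidates, cutoff=0.9)

print(score_apple, score_ape, best_only, strict)
```

Both candidates returned by the default call score below 0.9, which is why the stricter cutoff yields an empty list.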

_count_leading(line, ch)

Return number of `ch` characters at the start of `line`.

Example:

>>> _count_leading('   abc', ' ')
3

IS_LINE_JUNK(line, pat=re.compile(r"\s*#?\s*$").match)

Return 1 for ignorable line: iff `line` is blank or contains a single '#'.

Examples:

>>> IS_LINE_JUNK('\n')
1
>>> IS_LINE_JUNK('  #   \n')
1
>>> IS_LINE_JUNK('hello\n')
0
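
How the default pattern decides what counts as junk can be sketched with a small stand-alone predicate. The name `is_line_junk` is hypothetical and not part of this module; unlike the function documented here, it returns True/False rather than 1/0.

```python
import re

# Same pattern as the default `pat` argument: optional whitespace,
# at most one '#', optional whitespace, then end of line.
_junk_pattern = re.compile(r"\s*#?\s*$").match


def is_line_junk(line):
    """Hypothetical re-implementation of the junk test (returns a bool)."""
    return _junk_pattern(line) is not None


samples = ["\n", "  #   \n", "hello\n", "##\n", "# note\n"]
results = {text: is_line_junk(text) for text in samples}

for text, junk in results.items():
    print(repr(text), junk)
```

A line with two '#' characters, or with any visible text after the '#', is not treated as junk.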

IS_CHARACTER_JUNK(ch, ws=" \t")

Return 1 for ignorable character: iff `ch` is a space or tab.

Examples:

>>> IS_CHARACTER_JUNK(' ')
1
>>> IS_CHARACTER_JUNK('\t')
1
>>> IS_CHARACTER_JUNK('\n')
0
>>> IS_CHARACTER_JUNK('x')
0

ndiff(a, b, linejunk=IS_LINE_JUNK, charjunk=IS_CHARACTER_JUNK)

Compare `a` and `b` (lists of strings); return a `Differ`-style delta.

Optional keyword parameters `linejunk` and `charjunk` are for filter functions (or None):

linejunk: A function that should accept a single string argument, and return true iff the string is junk. The default is module-level function IS_LINE_JUNK, which filters out lines without visible characters, except for at most one splat ('#').
charjunk: A function that should accept a string of length 1. The default is module-level function IS_CHARACTER_JUNK, which filters out whitespace characters (a blank or tab; note: bad idea to include newline in this!).

Tools/scripts/ndiff.py is a command-line front-end to this function.

Example:

>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
...              'ore\ntree\nemu\n'.splitlines(1))
>>> print ''.join(diff),
- one
?  ^
+ ore
?  ^
- two
- three
?  -
+ tree
+ emu
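
For readers on Python 3, the same comparison can be sketched with the standard-library `difflib`, assuming it produces the same delta as this older copy. The two-character line tags used below ("- ", "+ ", "? ") follow the `Differ` delta format; the variable names are illustrative only.

```python
from difflib import ndiff

before = "one\ntwo\nthree\n".splitlines(keepends=True)
after = "ore\ntree\nemu\n".splitlines(keepends=True)

# ndiff() yields lines lazily; list() keeps them for repeated use.
delta = list(ndiff(before, after))

# The first two characters of each delta line are a tag:
# "- " means the line occurs only in `before`,
# "+ " means the line occurs only in `after`,
# "? " marks a hint line pointing at intraline differences.
removed = [line[2:] for line in delta if line.startswith("- ")]
added = [line[2:] for line in delta if line.startswith("+ ")]

print("".join(delta), end="")
```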

restore(delta, which)

Generate one of the two sequences that generated a delta.

Given a `delta` produced by `Differ.compare()` or `ndiff()`, extract lines originating from file 1 or 2 (parameter `which`), stripping off line prefixes.

Examples:

>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
...              'ore\ntree\nemu\n'.splitlines(1))
>>> diff = list(diff)
>>> print ''.join(restore(diff, 1)),
one
two
three
>>> print ''.join(restore(diff, 2)),
ore
tree
emu
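
The round trip from two inputs to a delta and back can be checked with a short sketch. It again assumes the standard-library `difflib` (Python 3) behaves like this module; the variable names are illustrative.

```python
from difflib import ndiff, restore

first = "one\ntwo\nthree\n".splitlines(keepends=True)
second = "ore\ntree\nemu\n".splitlines(keepends=True)

# Materialise the delta because restore() is called on it twice and a
# generator could only be consumed once.
delta = list(ndiff(first, second))

# which=1 recovers the first input, which=2 the second; the two-character
# prefixes and the "? " hint lines are stripped along the way.
recovered_first = list(restore(delta, 1))
recovered_second = list(restore(delta, 2))

print("".join(recovered_first), end="")
print("".join(recovered_second), end="")
```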

_test()
