.. @+leo-ver=5-thin
.. @+node:ekr.20031218072017.329: * @file ../doc/leoNotes.txt
.. @@language rest
.. @@killbeautify
.. @+all
.. @+node:ekr.20190123074336.1: ** Backup of public filters
https://github.com/leo-editor/leo-editor/issues/1064

This issue lists the filters that I use to organize and understand my workflow. They give a high-level overview of what will (and won't) be done.

**About two dozen significant enhancements remain**.  None of these will fundamentally change Leo.  Fixing bugs has a higher priority.

- [Curated unscheduled issues](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+no%3Amilestone+-label%3Amaybe+-label%3Ainfo+-label%3Apip+-label%3Awaiting+-label%3Alvl%3Aminor): Interesting open programming issues that have not been assigned a milestone. See the actual filter for details. Some may be assigned to 5.8.2.

- [Branch info](https://github.com/leo-editor/leo-editor/issues/1058): An info item reminding me what each branch does, and its status.

- [Important bugs](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Abug+-label%3Awaiting+-label%3Alvl%3Aminor): Significant bugs (`-label:lvl:minor`) that can be fixed now (`-label:waiting`).

- [Minor bugs](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Abug+-label%3Awaiting+label%3Alvl%3Aminor), [plugin bugs](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Abug+label%3APlugin+),  and [minor enhancements](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Aenhancement+-label%3Awaiting+label%3Alvl%3Aminor+).  I am responsible for fixing only "core" plugins such as viewrendered.py and mod_scripting.py.

- [5.8.1](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+milestone%3A5.8.1+-label%3Awaiting+-label%3Apip+is%3Aopen+): Programming issues scheduled for 5.8.1.

- [5.8.2](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+milestone%3A5.8.2+is%3Aopen+-label%3Apip): Programming issues scheduled for 5.8.2.

**Links to Labels**

- [Bug](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Abug+): All bugs. Borderline issues can have both the "Bug" and "Enhancement" labels.

- [Best](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Abest): The most significant enhancement issues.

- [EKR](https://github.com/leo-editor/leo-editor/issues?q=is%3Aissue+label%3AEKR+is%3Aopen): Issues in which I have (or have had) a particular interest. This label is no guarantee of action.

- [Info](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+label%3Ainfo): All Info items.

- [Summary](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Asummary): All Summary items. *Note*: Summary items are also Info items.

- [Importers](https://github.com/leo-editor/leo-editor/issues?q=is%3Aissue+import+is%3Aopen+label%3AImporter): Importer items are the most important links between Leo and other programs.

- [Emacs](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Aemacs): Issues related to Emacs. Some issues have been downgraded in importance recently.

- [Gui](https://github.com/leo-editor/leo-editor/issues?q=is%3Aissue+label%3Agui+is%3Aopen): Items related to Leo's gui code, especially window code. **These items won't happen unless someone else volunteers to do them**.

- [Maybe](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Amaybe): Open issues which I have no present commitment to do.

- [Won'tDo](https://github.com/leo-editor/leo-editor/issues?utf8=%E2%9C%93&q=is%3Aissue+label%3Awon%27tdo): Closed issues which I am unlikely to do, absent a significant change of opinion.

  Many of my own issues have been relegated to the "Won'tDo" category. I am no longer interested in slavishly adding features from other editors to Leo, nor am I interested in embedding Leo in other editors.

**The "Maybe" and "Won'tDo" labels are not necessarily death sentences.** I often change my mind. Comments are not allowed here. Please argue for an issue in [leo-editor](https://groups.google.com/forum/#!forum/leo-editor), or in the issue itself.
.. @+node:ekr.20200114052814.1: ** Backup of first comment of #1440
@language rest
@wrap

[leoAst.py](https://github.com/leo-editor/leo-editor/blob/fstrings/leo/core/leoAst.py) unifies python's token-oriented and ast-oriented worlds. This project is intended to be a major contribution to python's tool set.

**Overview**

The **TokenOrderGenerator** (TOG) class in leoAst.py creates the following data:

- A **children** array from each ast node to its children. Order matters!
- A **parent** link from each ast.node to its parent.
- Two-way links between tokens in the **token list**, a list of Token objects, and the [ast](https://docs.python.org/3/library/ast.html) nodes in the **parse tree**:
  - For each token, **token.node** contains the ast.node "responsible" for the token.
  - For each ast node, **node.first_i** and **node.last_i** are indices into the token list.
    These indices give the range of tokens that can be said to be "generated" by the ast node.
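
The links listed above can be approximated with the standard library alone. The following is a minimal sketch, not the TOG implementation: `link_tree` and `token_range` are hypothetical helpers, the links come from `ast.iter_child_nodes` (field order, which is not always token order), and the token range is derived from each node's source positions (Python 3.8+, ASCII source assumed).

```python
import ast
import io
import tokenize

def link_tree(tree):
    """Inject .parent and .children attributes into every node of tree."""
    tree.parent = None
    for node in ast.walk(tree):
        node.children = list(ast.iter_child_nodes(node))
        for child in node.children:
            # Caution: context/operator nodes (Load, Add, ...) may be shared
            # singletons, so their .parent is only the last one assigned.
            child.parent = node
    return tree

def token_range(node, tokens):
    """Return (first_i, last_i), the indices of the tokens inside node's span."""
    start = (node.lineno, node.col_offset)
    end = (node.end_lineno, node.end_col_offset)
    inside = [
        i for i, tok in enumerate(tokens)
        if start <= tok.start and tok.end <= end
    ]
    return inside[0], inside[-1]

source = "a = b + 1\n"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
tree = link_tree(ast.parse(source))
binop = tree.body[0].value  # The ast.BinOp node for "b + 1".
first_i, last_i = token_range(binop, tokens)
print([tok.string for tok in tokens[first_i:last_i + 1]])
```

Position-based matching like this needs a scan of the token list per node. The TOG class avoids that by assigning links while it visits nodes in token order.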

These links promise to collapse the complexity of any code that changes text, including the [asttokens](https://pypi.org/project/asttokens/), [fstringify](https://pypi.org/project/fstringify/), and [black](https://pypi.org/project/black/) projects.

This project is a general solution to frequently-asked questions such as [this](https://stackoverflow.com/questions/16748029). It fills gaps in python's ast module, as discussed in [this python issue](https://bugs.python.org/issue33337).

**Project details**

Leo's [fstrings branch](https://github.com/leo-editor/leo-editor/tree/fstrings) contains this work, as well as work on Leo's fstringifier command. This branch will be merged into [devel](https://github.com/leo-editor/leo-editor/tree/devel) from time to time.

leoAst.py is completely independent of [Leo](http://leoeditor.com/) itself. leoAst.py contains a complete suite of unit tests. Those unit tests completely cover the most important classes of the file.

Work is in progress on distributing leoAst.py as a separate project. In the meantime, you may find the latest version of leoAst.py [here](https://github.com/leo-editor/leo-editor/blob/fstrings/leo/core/leoAst.py).

**Status**

*The project is a complete success*.  The code in leoAst.py is simpler, easier to understand, more flexible, more robust and faster than the corresponding code in the asttokens, fstringify and black projects. See the following section for more details.

leoAst.py contains an entirely new implementation of the [fstringify](https://pypi.org/project/fstringify/) tool. This new fstringify is simpler, faster, more complete, more flexible and more reliable than fstringify itself.

leoAst.py also contains an entirely new implementation of [black](https://pypi.org/project/black/). This code is under development. Basing the code on the TOG class will surely make the tool simpler, more flexible and faster than black itself.

**Figures of merit**

**Simplicity**: The code (all of it) is the simplest thing that could possibly work. This is in stark contrast to the generators used in the [asttokens](https://pypi.org/project/asttokens/), [fstringify](https://pypi.org/project/fstringify/), and [black](https://pypi.org/project/black/) projects.

**Flexibility**: Flexibility comes from simplicity, not special cases. The code contains no hacks of any kind. Again, this is in stark contrast to the generators used by asttokens, fstringify and black.

**Speed**: The TOG creates two-way links between tokens and ast nodes in roughly the time taken by python's tokenize.tokenize and ast.parse library methods. This is substantially faster than the  [asttokens](https://pypi.org/project/asttokens/), [fstringify](https://pypi.org/project/fstringify/), and [black](https://pypi.org/project/black/) tools. The TOT class traverses trees annotated with parent/child links even more quickly.

**Memory**: The TOG class makes no significant demands on python's resources. Generators add nothing to python's call stack. TOG.node_stack is the only variable length data. This stack resides in python's heap, so its length is unimportant. In the worst case, it might contain a few thousand entries. The TOT class uses no variable-length data whatever.

**Innovation**: The code's speed, simplicity, robustness and flexibility are the result of months of work.  For more details, see the project's history in [this issue's third comment](https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-574145510).

**To do**

Infrastructure:
- [x] Create Token, Tokenizer, TokenOrderGenerator and TokenOrderTraverser classes.
- [x] Create a set of dumping tools.
- [x] Create nearest_common_ancestor and tokens_for_node functions.

Testing:
- [x] tokenizer.check_results verifies token round-tripping.
- [x] tog.sync_token is an ever-present unit test that verifies that the TOG visits nodes in token order.
- [x] Create unit tests that cover the TOG, TOT and Fstringify classes.

Real world tools:
- [x] Create the Fstringify class.
- [x] Create fstringify-files, diff-fstringify-files and silent-fstringify-files commands.
- [x] Rewrite Leo's beautify-files and diff-beautify-files commands using the Orange class.
- [ ] Complete the Orange class.
.. @+node:ekr.20200114055535.1: ** Backup of second comment of #1440: Theory of operation
@language rest
@wrap


**Theory of Operation**

This is the Theory of Operation for the TokenOrderGenerator (TOG) class and related classes. These classes should be a substantial legacy to the python community.

The notion of a **token order traversal** of a parse tree is the foundation of this project. This is the traversal order that generates tokens in the same order that they appear in the token list corresponding to the parse tree. This project proves that token order traversals exist, and that the notion of a token order traversal is well defined.

**Token-order classes**

The **TokenOrderGenerator** (TOG) class injects two-way links between all tokens and the corresponding ast nodes. The TOG class also injects parent/child links into all ast nodes.

The TOG class defines generators that visit ast nodes in **token order**, the traversal order that corresponds to the order of the tokens produced by python's tokenize module. The only way to ensure this correspondence is to use separate visitors for all ast nodes. All visitors are straightforward generators.

TOG visitors `yield from` tog.gen* generators to visit subtrees. All such generators eventually call TOG.sync_token, which checks that tokens are, in fact, visited in the correct order. TOG.sync_token is an ever-present unit test.

The **TokenOrderTraverser** (TOT) class uses the parent/child links created by the TOG class. TOT.traverse contains a single for-loop that visits all nodes of the parse tree in token order. This loop is extremely fast. Using the TOT class, client code can easily modify the token list or parse tree as desired.

**Significant vs insignificant tokens**

The distinction between **significant** and **insignificant** tokens is a crucial invention of this project. Visitors call TOG.gen_token, TOG.gen_op, etc. *only* for significant tokens.

Commas, semicolons, whitespace and parentheses are *insignificant* tokens. All other kinds of tokens are significant.
 
These definitions are anything but arbitrary. Visitors can't know, just by looking at the parse tree, whether the input contains *insignificant* tokens.  For example, the source tokens might contain non-essential parentheses, optional trailing commas, or a semicolon separating two statements.
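
The claim above can be checked with the standard library. This is a small sketch, not part of leoAst.py: `dump` and `token_strings` are hypothetical helpers, and each pair of sources differs only by parentheses, a trailing comma, or a semicolon.

```python
import ast
import io
import tokenize

def dump(source):
    """Return the parse tree of source as a string, without positions."""
    return ast.dump(ast.parse(source))

def token_strings(source):
    """Return the non-blank token strings of source."""
    toks = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in toks if tok.string.strip()]

pairs = [
    ("x = (a + b)\n", "x = a + b\n"),     # Non-essential parentheses.
    ("f(a, b,)\n", "f(a, b)\n"),          # Optional trailing comma.
    ("a = 1; b = 2\n", "a = 1\nb = 2\n"), # Semicolon vs. newline.
]
# For each pair: (are the trees equal?, are the tokens equal?)
results = [
    (dump(s1) == dump(s2), token_strings(s1) == token_strings(s2))
    for s1, s2 in pairs
]
print(results)
```

Each pair yields the same parse tree but a different token list, so a visitor that sees only the tree cannot predict these tokens.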

**Helping TOG visitors**

The TOG.do_If visitor calls **TOG.find_next_significant_token** to determine whether TOG.do_If should generate an "if" or an "elif" token. This help is essential, because the following two source files generate identical parse trees!
```python
  if 1:            if 1:
      pass             pass
  else:            elif 2:
      if 2:            pass
          pass
```
Similarly, the TOG.do_Str and TOG.do_JoinedStr visitors call **TOG.get_concatenated_string_tokens** to handle one or more concatenated string tokens.

Finally, TOG.do_Slice calls TOG.find_next_significant_token to determine whether a slice without a step contains an optional colon.

TOG.find_next_significant_token and TOG.get_concatenated_string_tokens are crucial inventions. The TOG class would not be possible without them.
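
The if/elif example above can be verified directly. In this sketch, `orelse_keyword` is a hypothetical stand-in for the kind of token lookahead that TOG.find_next_significant_token provides; it is not that method's implementation.

```python
import ast
import io
import tokenize

nested = "if 1:\n    pass\nelse:\n    if 2:\n        pass\n"
chained = "if 1:\n    pass\nelif 2:\n    pass\n"

# The two sources produce identical parse trees.
same_tree = ast.dump(ast.parse(nested)) == ast.dump(ast.parse(chained))

def orelse_keyword(source):
    """Return the first 'else' or 'elif' keyword token in source, or None."""
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in ("else", "elif"):
            return tok.string
    return None

print(same_tree, orelse_keyword(nested), orelse_keyword(chained))
```

Only the token list records which keyword the author wrote, which is why the visitor must consult it.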

**Syncing tokens**

**TOG.px** is an index into the token list. It is either -1, or it points at the previous significant token. 
*Note*: TOG.find_next_significant_token and TOG.get_concatenated_string_tokens use TOG.px, but never change TOG.px.

TOG.sync_token(self, kind, val) associates tokens with ast nodes as follows:

1. If (kind, val) denote an *insignificant* token, TOG.sync_token does nothing.
   
2. Otherwise, (kind, val) denotes a *significant* token. TOG.sync_token associates that token, *plus* all previous *insignificant* tokens with self.node, the ast node presently being visited.
   
In addition, if (kind, val) denotes a *significant* token, TOG.sync_token checks that the next *significant* token in the token list has the expected kind and value. This is done as follows:

- TOG.sync_token advances TOG.px to point at the next significant token, call it T.
- TOG.sync_token raises AssignLinksError if (T.kind, T.value) != (kind, val).

To summarize token syncing:

- The TOG.px index tracks the last-seen *significant* token.
- TOG.px advances monotonically through the token list.
- TOG.find_next_significant_token and TOG.get_concatenated_string_tokens scan forward through the token list using a private copy of TOG.px. These methods never change TOG.px itself.
- This token-syncing machinery is the *simplest* thing that could possibly work. It is also the *fastest* thing that could possibly work.
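
The syncing rules above can be sketched as follows. This is an illustration under stated assumptions, not the code in leoAst.py: the `Token` and `Syncer` classes, the `is_significant` helper, the "ws" token kind and the use of strings in place of ast nodes are all hypothetical. Only the names `px`, `sync_token` and `AssignLinksError` come from the text.

```python
class AssignLinksError(Exception):
    """Raised when the visitor and the token list disagree."""

class Token:
    def __init__(self, kind, value):
        self.kind, self.value = kind, value
        self.node = None  # Set by Syncer.sync_token.

def is_significant(kind, value):
    """Whitespace, commas, semicolons and parentheses are insignificant."""
    if kind == "ws":
        return False
    return not (kind == "op" and value in {",", ";", "(", ")"})

class Syncer:
    def __init__(self, tokens):
        self.tokens = tokens
        self.px = -1      # Index of the previous significant token, or -1.
        self.node = None  # The node presently being visited.

    def sync_token(self, kind, val):
        if not is_significant(kind, val):
            return  # Rule 1: do nothing for insignificant tokens.
        # Advance a local index to the next significant token, T.
        px = self.px + 1
        while px < len(self.tokens):
            tok = self.tokens[px]
            if is_significant(tok.kind, tok.value):
                break
            px += 1
        else:
            raise AssignLinksError(f"no token left for {kind!r} {val!r}")
        if (tok.kind, tok.value) != (kind, val):
            raise AssignLinksError(f"expected {kind!r} {val!r}, got {tok.value!r}")
        # Rule 2: link T, plus the skipped insignificant tokens, to self.node.
        for i in range(self.px + 1, px + 1):
            self.tokens[i].node = self.node
        self.px = px  # px only ever moves forward.

# Tokens for "a = (b)"; the strings below stand in for real ast nodes.
tokens = [Token(k, v) for k, v in [
    ("name", "a"), ("ws", " "), ("op", "="), ("ws", " "),
    ("op", "("), ("name", "b"), ("op", ")"),
]]
syncer = Syncer(tokens)
for node, kind, val in [
    ("Name:a", "name", "a"), ("Assign", "op", "="),
    ("Assign", "op", "("),  # Insignificant: ignored.
    ("Name:b", "name", "b"),
]:
    syncer.node = node
    syncer.sync_token(kind, val)
print([tok.node for tok in tokens])
```

Because `px` never moves backward, the total work across all calls is linear in the length of the token list. In this sketch the trailing `)` stays unlinked, since no significant token follows it.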
.. @+node:ekr.20200114071235.1: ** Backup of third comment of #1440: Project history
@language rest
@wrap

**Project history**

EKR's posts record the many Aha's there have been along the way:

Part 1: Initial steps.

- Nov 10, 2019: [ENB: The unification of the token and ast worlds](https://groups.google.com/forum/#!topic/leo-editor/FZYJmbtRBWs) contains the initial goal.
- Nov 13, 2019: [ENB: flattening the TokenOrderTraverser class](https://groups.google.com/d/msg/leo-editor/2vXq5F-rsdQ/vKYHUGsqAwAJ).
- Nov 15, 2019: [ENB: Two Ahas re unifying the token and parse-tree worlds](https://groups.google.com/d/msg/leo-editor/6JfTBs7i73o/CWZVHXV_AwAJ).
- Nov 16, 2019: [ENB: still a juicy project](https://groups.google.com/d/msg/leo-editor/malKiU__CUY/0WJJcIHXAwAJ).
- Nov 16, 2019: [ENB: The intuition wars are over](https://groups.google.com/d/msg/leo-editor/cw77W5XEgDw/gDrN6nrkAwAJ). TOG clearly can work in linear time.
- Nov 18, 2019: [Success! High-level overview and status report](https://groups.google.com/d/msg/leo-editor/PsB8dTr2Wac/Yh49dHh6BAAJ).
- Nov 19, 2019: [The Linker class has collapsed!](https://groups.google.com/d/msg/leo-editor/n0nRwcVHFQo/87Jx_EwDBAAJ). Eventually the linker class completely disappeared...
- Nov 22, 2019: [Brief status report; long ENB re Ahas and code](https://groups.google.com/d/msg/leo-editor/x3F8q_Gxvws/jP_13dGOAAAJ). At this point I knew that the do_If visitor could not distinguish between "else if" and "elif".

Part 2: A stupendous collapse in complexity.

Nov 23, 2019 was the most momentous day in this project's history. [ENB: Wow: the entire project will be nothing but two parallel generators](https://groups.google.com/d/msg/leo-editor/_tvE6lwDKHg/mMQ0GufMAAAJ):

**First Aha**: "Neither the results list nor the Linker class is needed! This is a stunning collapse in complexity....Linker.set_links only sees significant tokens.  It could migrate to tog.put [which later became tog.sync_token]." This was followed by a lengthy discussion of details.

**Second Aha**: "That was the end of a long day's work.  However, as I was talking to Rebecca about the success, all the pieces fell into place.

"I'll never forget it.  I was brushing my teeth and saying wow.  A pause.  Another wow.  Another pause.  Another wow. I realized that the "if-list" should be a generator, that a token generator could yield all significant tokens, and that the tree and token generators would work in parallel."

"I explained it to Rebecca as follows. I held both arms over my head.  My right hand represented the tree generator.  It moved back and forth as it moved from top to bottom, simulating traversing the tree.  My left hand represented the token generator.  It moved straight down, because it was just moving from one token to the next in the token list. Crucially, the height of the two hands always matched. The generators work in parallel."

**Third Aha**: Later that day, I wrote [this](https://groups.google.com/d/msg/leo-editor/_tvE6lwDKHg/ph13Da4iAQAJ):

"Doh. The so-called generators are nothing but indices into the token list. There is no need for helper lists! peek and advance need only maintain an integer index into the token list.  This index monotonically advances, guaranteeing O(N) performance."

Part 3: Final details.

- Nov 27, 2019: [ENB: The last puzzle](https://groups.google.com/d/msg/leo-editor/FQRbeq5HJdI/RrIJJu4OAgAJ). At this point I was still trying to use the parse tree as the primary data.
- Dec 5, 2019: [The unification of tokens and parse trees is complete].
- Dec 5, 2019: [Status report and a big Aha](https://groups.google.com/d/msg/leo-editor/wFM8TyIN-9s/Cx0ZcX59BAAJ):

"A giant Aha: We can determine which 'string' tokens are concatenated just by looking at the token list!!! Indeed, 'string' tokens are concatenated if and only if there are no significant tokens (including parens) between them!"
.. @+node:ekr.20161022035203.1: ** Test code: do not delete
@language python
# This tree contains clones. None are contained in any external file.
.. @+node:ekr.20161006162035.1: *3* cff regex pattern to find section references
# This works
<<(\s*)(\w+)(\s+)(\w+)(.*)>>

# These don't work
<<(\s*)(?!(import|docstring|includes))(\w+)(\s*)>>
<< xyz >>
<< import >>
.. @+node:ekr.20180213054048.1: *3* clean recent files test
self = g.app.recentFilesManager
result = [z for z in self.recentFiles if g.os_path_exists(z)]
if result != self.recentFiles:
    for path in result:
        self.updateRecentFiles(path)
    self.writeRecentFilesFile(c, force=True)
.. @+node:ekr.20180125040406.1: *3* script: clear g.app.db['shown-tips']
g.app.db['shown-tips'] = []
.. @+node:ekr.20170206165145.1: *3* script: test demo.py
g.cls()
# c.frame.log.clearLog()
if c.isChanged(): c.save()
import importlib  # The imp module was removed in Python 3.12.
from leo.core.leoQt import QtGui
import leo.plugins.demo as demo
importlib.reload(demo)
table = [
'''\
demo.delete_widgets()
demo.callout('Callout 1 centered')
demo.subtitle('This is subtitle 1')
''',
'''\
demo.delete_widgets()
demo.callout('Callout 2 (700, 200)', position=[700, 200])
demo.subtitle('This is subtitle 2')
''',
'''\
demo.delete_widgets()
demo.callout('Callout 3 (200, 300)', position=[200, 300])
demo.subtitle('This is subtitle 3')
''',
'''\
demo.delete_widgets()
demo.callout('Callout 4 (center, 200)', position=['center', 200])
demo.subtitle('This is a much much longer subtitle 4')
''',
'''\
demo.delete_widgets()
demo.callout('Callout 5 (700, center)', position=[700, 'center'])
demo.subtitle('Short 5')
''',
'''\
demo.delete_widgets()
demo.next()
''',
]
color = QtGui.QColor('lightblue')
sub_color = QtGui.QColor('mistyrose')
demo = demo.Demo(c, color=color, subtitle_color=sub_color, trace=False)
demo.delete_widgets()
demo.start(script_list = table)
.. @+node:ekr.20170317101032.1: *3* test g.unCamel
g.cls()

table = (
    'abcXyz',
    'AbcXyz',
    'abcXyzW',
)
for s in table:
    print(s)
    g.printList(g.unCamel(s))
.. @-all
.. @@nosearch
.. @-leo
