Reword docs: Mention that BriDoc trees are DAGs
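
The distinction this rewording draws (a linear-sized rooted DAG that still has exponentially many paths) can be illustrated with a small sketch. It assumes a hypothetical toy `Doc` type with integer node labels; it is not brittany's actual `BriDoc`/`BriDocF`, and the names `shared`, `naiveVisits` and `memoVisits` are made up for illustration only.

```haskell
import qualified Data.IntSet as IntSet

-- Hypothetical toy document type (NOT the real BriDoc).
-- Every node carries a unique Int label, loosely mirroring the idea of
-- labeled nodes used for manual memoization.
data Doc
  = Leaf Int
  | Alt Int Doc Doc

label :: Doc -> Int
label (Leaf l)    = l
label (Alt l _ _) = l

-- Build a document of depth n in which both alternatives reference the
-- same child value. In memory this is a DAG with n + 1 nodes.
shared :: Int -> Doc
shared 0 = Leaf 0
shared n = let child = shared (n - 1) in Alt n child child

-- Naive traversal: follows every path, so shared nodes are visited
-- once per path that reaches them.
naiveVisits :: Doc -> Integer
naiveVisits (Leaf _)    = 1
naiveVisits (Alt _ l r) = 1 + naiveVisits l + naiveVisits r

-- Label-aware traversal: remembers which labels were already seen and
-- skips them, so each node of the DAG is visited exactly once.
memoVisits :: Doc -> Int
memoVisits doc = IntSet.size (go doc IntSet.empty)
  where
    go node seen
      | label node `IntSet.member` seen = seen
      | otherwise =
          let seen' = IntSet.insert (label node) seen
          in case node of
               Leaf _    -> seen'
               Alt _ l r -> go r (go l seen')
```

For `shared 10`, `naiveVisits` counts 2047 visits while `memoVisits` counts 11, which is the gap between "number of paths" and "number of nodes" that the reworded docs describe.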

2017-05-21 13:56:20 +02:00 · 2017-05-21 13:56:20 +02:00 · 647fa94ef3
parent 4ee44388f7
commit 647fa94ef3
5 changed files with 45 additions and 39 deletions
--- a/doc-svg-gen/Main.hs
+++ b/doc-svg-gen/Main.hs
@@ -170,7 +170,7 @@ main = do
                       ["module header", "modulechildren"]

    subContext "bridocgen" $ Text.Lazy.unlines
-      [ "translation into BriDoc tree"
+      [ "translation into BriDoc tree/DAG"
      , "in (nested) monadic context"
      , "(additional) State: NodeAllocIndex"
      ]
@@ -270,17 +270,17 @@ main = do
    dataTransformationLabeled "layoutSig"
                              "layoutSig\n+recursion\n(layoutType etc.)"
                              [("type of node?", Just "type sig")]
-                              ["BriDoc (tree)"]
+                              ["BriDoc (tree/DAG)"]

    dataTransformationLabeled "layoutBind"
                              "layoutBind\n+recursion\n(layoutExpr etc.)"
                              [("type of node?", Just "equation")]
-                              ["BriDoc (tree)"]
+                              ["BriDoc (tree/DAG)"]

    dataTransformationLabeled "layoutByExact"
                              "layoutByExact"
                              [("type of node?", Just "not handled (yet)")]
-                              ["BriDoc (tree)"]
+                              ["BriDoc (tree/DAG)"]

  -- backend :: Data.GraphViz.Types.Generalised.DotGraph String
  -- backend = digraph (Str ("ppm")) $ do
--- a/doc-svg-gen/generated/bridocgen.svg
+++ b/doc-svg-gen/generated/bridocgen.svg
@@ -67,27 +67,27 @@
 <path fill="none" stroke="#666666" stroke-width="2" d="M204,-243.9551C204,-235.8828 204,-226.1764 204,-217.1817"/>
 <polygon fill="#666666" stroke="#666666" stroke-width="2" points="207.5001,-217.0903 204,-207.0904 200.5001,-217.0904 207.5001,-217.0903"/>
 </g>
-<!-- BriDoc (tree) -->
+<!-- BriDoc (tree/DAG) -->
 <g id="node4" class="node">
-<title>BriDoc (tree)</title>
-<polygon fill="#d3d3d3" stroke="#d3d3d3" points="259,-36 149,-36 149,0 259,0 259,-36"/>
-<text text-anchor="middle" x="204" y="-14.3" font-family="Times,serif" font-size="14.00" fill="#000000">BriDoc (tree)</text>
+<title>BriDoc (tree/DAG)</title>
+<polygon fill="#d3d3d3" stroke="#d3d3d3" points="278,-36 130,-36 130,0 278,0 278,-36"/>
+<text text-anchor="middle" x="204" y="-14.3" font-family="Times,serif" font-size="14.00" fill="#000000">BriDoc (tree/DAG)</text>
 </g>
-<!-- layoutSig&#45;&gt;BriDoc (tree) -->
+<!-- layoutSig&#45;&gt;BriDoc (tree/DAG) -->
 <g id="edge3" class="edge">
-<title>layoutSig&#45;&gt;BriDoc (tree)</title>
+<title>layoutSig&#45;&gt;BriDoc (tree/DAG)</title>
 <path fill="none" stroke="#666666" stroke-width="2" d="M104.6047,-72.9474C122.8312,-62.8715 144.0469,-51.1431 162.1855,-41.1157"/>
 <polygon fill="#666666" stroke="#666666" stroke-width="2" points="164.2999,-43.9462 171.3583,-36.0449 160.9132,-37.8199 164.2999,-43.9462"/>
 </g>
-<!-- layoutBind&#45;&gt;BriDoc (tree) -->
+<!-- layoutBind&#45;&gt;BriDoc (tree/DAG) -->
 <g id="edge5" class="edge">
-<title>layoutBind&#45;&gt;BriDoc (tree)</title>
+<title>layoutBind&#45;&gt;BriDoc (tree/DAG)</title>
 <path fill="none" stroke="#666666" stroke-width="2" d="M204,-72.9474C204,-64.5354 204,-54.9716 204,-46.2075"/>
 <polygon fill="#666666" stroke="#666666" stroke-width="2" points="207.5001,-46.0449 204,-36.0449 200.5001,-46.0449 207.5001,-46.0449"/>
 </g>
-<!-- layoutByExact&#45;&gt;BriDoc (tree) -->
+<!-- layoutByExact&#45;&gt;BriDoc (tree/DAG) -->
 <g id="edge7" class="edge">
-<title>layoutByExact&#45;&gt;BriDoc (tree)</title>
+<title>layoutByExact&#45;&gt;BriDoc (tree/DAG)</title>
 <path fill="none" stroke="#666666" stroke-width="2" d="M306.4871,-78.4905C287.7815,-67.45 263.6397,-53.2009 243.4861,-41.3057"/>
 <polygon fill="#666666" stroke="#666666" stroke-width="2" points="245.2224,-38.2664 234.8315,-36.1975 241.6643,-44.2947 245.2224,-38.2664"/>
 </g>
--- a/doc-svg-gen/generated/ppm.svg
+++ b/doc-svg-gen/generated/ppm.svg
@@ -20,7 +20,7 @@
 <g id="node7" class="node">
 <title>bridocgen</title>
 <ellipse fill="none" stroke="#000000" cx="278.5" cy="-982.6432" rx="187.2667" ry="37.4533"/>
-<text text-anchor="middle" x="278.5" y="-993.9432" font-family="Times,serif" font-size="14.00" fill="#000000">translation into BriDoc tree</text>
+<text text-anchor="middle" x="278.5" y="-993.9432" font-family="Times,serif" font-size="14.00" fill="#000000">translation into BriDoc tree/DAG</text>
 <text text-anchor="middle" x="278.5" y="-978.9432" font-family="Times,serif" font-size="14.00" fill="#000000">in (nested) monadic context</text>
 <text text-anchor="middle" x="278.5" y="-963.9432" font-family="Times,serif" font-size="14.00" fill="#000000">(additional) State: NodeAllocIndex</text>
 </g>
--- a/docs/implementation/bridoc-design.md
+++ b/docs/implementation/bridoc-design.md
@@ -100,12 +100,14 @@ used in exactly that manner: Both are referenced once in each of the two
 alternatives.

 Unfortunately this does not mean that we can forget this issue entirely.
-The problem is that the BriDoc tree value will get transformed by multiple
-transformations. And this "breaks" sharing: If we take an exponential-sized
-tree that is linear-via-sharing and `fmap` some function `f` on it (think of
-some general-purpose tree that is Functor) then `f` will be evaluated an
-exponential number of times. And worse, the output will have lost any sharing.
-Sharing is not automatic memoization.
+The problem is that the BriDoc tree (or maybe: rooted DAG, given that we share
+nodes) value will get transformed by multiple transformations.
+And this "breaks" sharing: If we naively traverse every path in a DAG and
+`fmap` some function `f` on it (think of some general-purpose tree/graph that
+is Functor) then `f` will be evaluated an exponential number of times, because
+our linear DAG still has an exponential amount of different paths.
+And worse, the output will have lost any sharing, so becomes a tree with an
+exponential number of nodes. Sharing is not automatic memoization.
 And this holds for BriDoc, even when the transformations are not exactly
 `fmap`s.

@@ -123,12 +125,14 @@ So.. we already mentioned "memoization" there, right?
   we can abstract over that pretty well.
   
 2. The good news:
-   With manual memoization, creating an exponentially-sized tree is no
-   problem, presuming that it is linear-via-sharing. Not messing up this
-   property can take a bit of consideration - but otherwise we are set.
-   If the `BriDocF` tree is exponential, the transformations will still
-   do only linear-amount of "selection work" in order to convert into a
-   linear-sized `BriDoc` tree.
+   With manual memoization, we really work on rooted DAGs
+   (with linear amount of nodes and edges) instead of trees, because we share
+   nodes. Not messing up this property (that we always share nodes where
+   necessary) can take a bit of consideration - but otherwise we are set.
+   Transformations on this DAG can be expressed in such a way that they only
+   require a linear amount of work, and our first transformation will output
+   a (linear-sized) tree, so there is relatively little code that needs to
+   handle a DAG.

   This property is the defining one that motivates the BriDoc
   intermediate representation.
@@ -161,7 +165,7 @@ The `BriDocF f` type encapsulates the idea that each subnode is wrapped
 in the `f` container. This notion gives us the following nice properties:

 `BriDocF Identity ~ BriDoc` and `BriDocF ((,) Int)` is the
-manual-memoization tree with labeled nodes. Abstractions, abstractions..
+manual-memoization tree/DAG with labeled nodes. Abstractions, abstractions..

 Lets have a glance at related code/types we have so far:

@@ -169,8 +173,8 @@ Lets have a glance at related code/types we have so far:
 -- The pure BriDoc: What we really want, but cannot use everywhere due
 -- to sharing issues.
 -- Isomorphic to `BriDocF Identity`. We still use this type, because
--- then we have to unwrap the `Identities` only in once place after reducing
--- the tree to a non-exponentially-sized one.
+-- then we have to unwrap the `Identities` only in one place after turning
+-- the DAG into a tree (and getting rid of any exponentiality in the process).
 data BriDoc
  = BDEmpty
  | BDLit !Text
--- a/docs/implementation/theory.md
+++ b/docs/implementation/theory.md
@@ -102,7 +102,7 @@ consider the circumstances for which a non-optimal solution is returned.

 ## The Reasoning

-A top-down approach is so bad, because when there are exponentially many
+A top-down approach is so bad because when there are exponentially many
 layouts to consider, there information passed down from the parents does
 not help at all in pruning the alternatives on a given layer. In the above
 `nestedCaseExpr` example, we might obtain a better solution by looking not
@@ -211,14 +211,16 @@ required per node of the input.
  which abstracts of different syntactical constructs and only considers the
  things relevant for layouting. This data-type is called `BriDoc`.

-- The `BriDoc` tree has an exponential number of nodes, but it is linear when
-  sharing is considered - the child-nodes can (and must) be re-used across
-  different alternatives. In the `nestedCaseExpr` example above, note how
-  there are four layouts, but essentially only two ways in which the "if" is
-  layouted. Either as a single line or with then/else on new lines. We can
-  handle spacings in such a way that we can share them for 1/3 and 2/4. This
-  already hints at how "columns used" will need to be redesigned slightly so
-  that 2/4 really have the same spacing label at the "if".
+- If we did not share values, we'd work on `BriDoc` trees of exponential size.
+  By sharing child-nodes across different alternatives we instead obtain a
+  rooted DAG of linear size, but still with an exponential number of different
+  paths.
+  In the `nestedCaseExpr` example above, note how there are four layouts, but
+  essentially only two ways in which the "if" is layouted.
+  Either as a single line or with then/else on new lines. We can handle
+  spacings in such a way that we can share them for 1/3 and 2/4.
+  This already hints at how "columns used" will need to be redesigned slightly
+  so that 2/4 really have the same spacing label at the "if".

 ### Concessions/non-Optimality