Document commit ordering

2023-10-23 19:44:14 +02:00 · 2023-10-23 19:44:14 +02:00 · 848c4ad019
commit 848c4ad019
parent c9ff8ab228
1 changed files with 38 additions and 17 deletions
--- a/scripts/graph.ts
+++ b/scripts/graph.ts
@ -12,30 +12,30 @@ The graph should be fast. This requires a bit of careful thinking around data
 formats and resource usage.
 My plan is to get as far as possible without any sort of pagination or range
-limits. The plot should always display data for the repo's entire history
+limits. The graph should always display data for the repo's entire history
-(unless zoomed in). This will force me to optimize the entire pipeline. If
+(unless zoomed in). This will force me to optimize the entire pipeline. If the
-the result is not fast enough, I can still add in range limits.
+result is not fast enough, I can still add in range limits.
 uPlot is pretty fast at rendering large amounts of data points. It should be
-able to handle medium-sized git repos (tens of thousands of commits)
+able to handle medium-sized git repos (tens of thousands of commits) displaying
-displaying multiple metrics with no issues. The issue now becomes retrieving
+multiple metrics with no issues. The issue now becomes retrieving the data from
-the data from the server.
+the server.
-Since the graph should support thousands of metrics, it can't simply fetch
+Since the graph should support thousands of metrics, it can't simply fetch all
-all values for all metrics upfront. Instead, it must fetch metrics as the
+values for all metrics upfront. Instead, it must fetch metrics as the user
-user selects them. It follows that when fetching a metric,
+selects them. It follows that when fetching a metric,
 1. the server should have little work to do,
 2. the amount of data sent over the network should be small, and
 3. the client should have little work to do.
-The costs when initially loading the graph may be higher since it happens
+The costs when initially loading the graph may be higher since it happens less
-less frequently. We can fetch some more data and do some preprocessing to
+frequently. We can fetch some more data and do some preprocessing to improve
-improve performance while interacting with the graph.
+performance while interacting with the graph.
-Since we fetch data across multiple requests, we need some way to detect if
+Since we fetch data across multiple requests, we need some way to detect if all
-all the data we have is consistent (at least in cases where things might
+the data we have is consistent (at least in cases where things might otherwise
-otherwise break).
+break).
 Implementation
 ==============
@ -49,7 +49,8 @@ The data for the graph consists of three main parts:
 Data consistency
 ----------------
-Each response by the server includes a graph id and a data id.
+Responses by the server include a graph id (for 2. and 3.) and a data id (for 1.
 and 3.).
 The graph id is incremented when the commit graph structure changes. Responses
 to 2. and 3. MUST have the same graph id. When they don't, the client must
@ -67,7 +68,7 @@ Data flow
 │ /graph/metrics │   │ /graph/measurements │    │ /graph/commits │
 └──┬─────────────┘   └──┬──────────────────┘    └──────────┬─────┘
   │      Server        │                                  │
-───┼────────────────────┼─────────Requests─────────────────┼──────
+───┼────────────────────┼──────────────────────────────────┼──────
   │                    │                                  │
 ┌──▼─────┐    ┌─────────▼────┐  ┌─────────────────┐  ┌─────▼─────┐
 │ metric │    │ measurements │  │ permute by-hash ◄──┤commit info│
@ -90,6 +91,26 @@ Data flow
 │ selector │  │     checkbox    │  └──────┘
 └──────────┘  └─────────────────┘
 Commit orders
 -------------
 There are two main orders for commits in this implementation.
 The first order ("by hash") is simply sorting by commit hash. All data returned
 by the server is in this order. It easy for the server to retrieve data in this
 order, and the order is unambiguous.
 The second order ("by graph") is the order used to display commits in the graph.
 Points in the graph must be ordered in ascending order along the x axis or uPlot
 will produce graphical glitches. For this graph, the x value is either their
 exact committer time or, in day-equidistant mode, the committer day and a unique
 offset per commit in the same day.
 Multiple commits may share exactly the same committer time. To break such ties
 in day-equidistant mode, the commits are ordered in topological order. While
 this change is only relevant in day-equidistant mode, we can reuse the same
 ordering for both display modes, so this is also relevant to the normal mode.
 */
 // https://sashamaps.net/docs/resources/20-colors/