102 commits

SHA1 Message Date
8605f8d43f Implement leon-wiki-graph command 2022-11-01 01:22:15 +01:00
60ba7721db Add longest-shortest-path command 2022-10-23 01:16:08 +02:00
1265dd4a41 Move some functions to util 2022-10-23 01:14:31 +02:00
a7b9849183 Add --flip flag to path command 2022-10-22 20:46:26 +02:00
d5f55d2855 Print more stuff 2022-10-22 19:58:19 +02:00
8bb94b1847 Allow redirects to have 0 links 2022-10-22 19:54:08 +02:00
2e6539cbc5 Fix redirect importing 2022-10-22 19:38:35 +02:00
e91a2db1b1 Allow specifying redirects as path start and end 2022-10-22 19:38:27 +02:00
0e3d61d632 Add list-pages command 2022-10-22 19:25:01 +02:00
c40153be9f Try out different costs 2022-10-22 19:14:16 +02:00
d99b3d49e0 Refactor changing data 2022-10-22 19:14:03 +02:00
32b72c10c8 Assign redirects a cost of 0 2022-10-22 18:48:42 +02:00
d1a80a6ae6 Print redirects differently 2022-10-22 18:41:47 +02:00
0d4087fdde Detect if no path exists 2022-10-22 18:41:40 +02:00
179e6b74a5 Implement simple dijkstra 2022-10-22 17:14:47 +02:00
8b62ff78bd Prepare dijkstra in path command 2022-10-22 16:23:35 +02:00
67f405a21e Make data representation more flexible 2022-10-22 15:52:07 +02:00
49b27715f0 Print duplicate page map entries 2022-10-22 15:40:45 +02:00
786b180b09 Add imhex patterns 2022-10-22 01:21:59 +02:00
853e09517f Add unfinished path command 2022-10-22 01:21:59 +02:00
345462915b Change AdjacencyMap associated data 2022-10-22 01:21:59 +02:00
5656f65b6c Refactor ingestion 2022-10-22 01:21:59 +02:00
3296f6d15a Fix page link_idx computation 2022-10-22 00:05:15 +02:00
a9435e4f64 Lowercase only first char when normalizing 2022-10-22 00:01:04 +02:00
3a75089e5a Make adjacency list extensible 2022-10-21 20:39:53 +02:00
78aa27c019 Add more checks 2022-10-21 19:53:15 +02:00
23463522f0 Don't print escape characters directly 2022-10-04 21:47:43 +02:00
f71092058b Refactor export and add page length 2022-10-03 22:14:58 +02:00
d910047b48 Perform consistency check when reexporting 2022-10-03 18:11:51 +02:00
e74eee89e6 Add reexport command 2022-10-03 18:07:30 +02:00
266f001d46 Move commands to own module 2022-10-03 18:04:24 +02:00
969fd01914 Export links to custom binary format 2022-10-03 18:01:15 +02:00
0e0789cc4d Ingest new json format 2022-10-03 17:36:08 +02:00
78a5aa5169 Ignore all namespaces except 0 2022-10-03 16:26:08 +02:00
ecdeb4086a Make json format more consistent 2022-10-03 15:00:23 +02:00
51096c99e1 Make stored data more compact 2022-10-01 01:49:01 +02:00
f6bcb39c52 Import data and check consistency 2022-09-30 19:53:41 +02:00
1ea09a9be9 Export data to CBOR 2022-09-30 19:50:02 +02:00
499642cda9 Convert first stage data into proper adjacency list 2022-09-30 19:30:47 +02:00
11c4ff699f Try out faster hash algorithm 2022-09-30 19:02:57 +02:00
5e8589f73e Load input into adjacency-list-like structure 2022-09-30 18:53:56 +02:00
b1f2af9577 Use simd-json 2022-09-30 18:07:50 +02:00
c195fbb8d4 Load sift data from stdin 2022-09-30 18:07:31 +02:00
2e2045a74d Set up brood subproject 2022-09-30 16:01:09 +02:00
d29e7257ba Include namespace in info 2022-09-30 02:19:29 +02:00
7cf5b013da Handle revisions without text 2022-09-30 01:34:06 +02:00
1db581725b Handle redirects 2022-09-30 01:18:51 +02:00
23c7df3c43 Elaborate on sift 2022-09-30 01:18:41 +02:00
73064ea2b0 Extract links from articles 2022-09-30 00:39:44 +02:00
fe1db32c0e Iterate through pages in dump 2022-09-30 00:09:49 +02:00