F(ile)S(ystem) D(ata)B(ase)
Store your data as files on disk, indexed for fast lookup as browsable symlinks and/or persistent btrees.
Why
Could a database be made as simple as a directory on a filesystem?
JSON stored in files you can grep, git diff, back up, and read with any text editor.
Indexes you can browse directly on the disk or open in a text editor.
No complex deployment requirements with remote hosts, etc.
Not for everything, or everyone.
Quick Start
import * as fsdb from '@andyburke/fsdb';
import { FSDB_INDEXER_SYMLINKS, FSDB_INDEXER_BTREE } from '@andyburke/fsdb/indexers';
import { by_character, by_email, by_phone } from '@andyburke/fsdb/organizers';
type USER = {
id: string;
username: string;
email: string;
phone: string;
};
const users = new fsdb.FSDB_COLLECTION<USER>({
name: 'users',
root: './data/users',
indexers: {
username: new FSDB_INDEXER_SYMLINKS({
name: 'username',
field: 'username',
organize: by_character,
}),
email: new FSDB_INDEXER_SYMLINKS({
name: 'email',
field: 'email',
organize: by_email
}),
phone: new FSDB_INDEXER_BTREE({
name: 'phone',
field: 'phone',
organize: by_phone
})
}
});
// create an item
const user = await users.create({ id: 'able-fish-door', username: 'alice', email: 'alice@example.com', phone: '213-555-1234' });
// look up by id
const found = await users.get('able-fish-door');
// search via symlink index -- browsable on disk
const by_email_results = await users.find({ email: 'alice@example.com' });
// search via BTree index -- supports prefix matching -- you can open the persisted json in a text editor
const by_phone_prefix = await users.find({ phone: '213', substring: true });
// delete the item
await users.delete(found);
API
Each collection is an FSDB_COLLECTION<T> that manages items of type T on disk.
Core Methods
`collection.create(item)` -- stores an item and returns it
`collection.get(id)` -- retrieves an item by its id, or null if not found
`collection.update(item)` -- updates an existing item (id must be present)
`collection.delete(item)` -- removes an item from disk and cleans up indexes
`collection.all([options])` -- iterates over every item in the collection
Search and Event Methods
`collection.find(criteria, [options])` -- finds items matching indexed fields
`collection.on(event, handler)` -- subscribes to lifecycle events
`collection.off(event, handler)` -- unsubscribes from events
The find method takes an object whose keys must correspond to indexer names:
// exact match on the email indexer
const exact_matches = await users.find({ email: 'alice@example.com' });
// multiple criteria -- must match all provided indexers
const combined_matches = await users.find({ email: 'alice@example.com', phone: '213-555-1234' });
// with pagination
const paged_matches = await users.find({ email: 'alice@example.com' }, { limit: 10, offset: 20 });
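How several criteria and the pagination options interact can be pictured with a small sketch. It assumes each indexer yields a list of matching item ids, that criteria combine as an intersection ("must match all"), and that `offset` skips results before `limit` caps them. `intersect_matches` and `paginate` are hypothetical helpers written for illustration, not part of FSDB.

```typescript
// Hypothetical helpers illustrating "match all criteria, then paginate".
// Not FSDB's implementation; the per-indexer id lists are assumed inputs.

// Keep only the ids that every indexer reported.
function intersect_matches(matches_per_indexer: string[][]): string[] {
  const first = matches_per_indexer[0];
  if (first === undefined) return [];
  const rest_sets = matches_per_indexer.slice(1).map((ids) => new Set(ids));
  return first.filter((id) => rest_sets.every((set) => set.has(id)));
}

// Skip `offset` results, then return at most `limit` of what remains.
function paginate<T>(items: T[], options: { limit?: number; offset?: number } = {}): T[] {
  const offset = options.offset ?? 0;
  const end = options.limit === undefined ? undefined : offset + options.limit;
  return items.slice(offset, end);
}

const email_matches = ['able-fish-door', 'calm-green-lake'];
const phone_matches = ['able-fish-door', 'dark-iron-gate'];

// Only the id present in both lists survives.
const combined = intersect_matches([email_matches, phone_matches]);

// Skips one item, then takes the next two.
const page = paginate(['a', 'b', 'c', 'd', 'e'], { limit: 2, offset: 1 });
```

Whether FSDB applies pagination before or after combining criteria is not stated above, so treat the ordering here as an assumption.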
Events
Events fire for every operation with { item, item_path } in the payload:
create, update, get, delete, write, index, all, find
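The `on`/`off` pairing can be sketched with a stand-in emitter. `SKETCH_EVENTS` below is a hypothetical class that only mimics the `on(event, handler)` / `off(event, handler)` shape and the `{ item, item_path }` payload described here; it is not FSDB's event system, and the example paths are made up.

```typescript
// Minimal stand-in emitter with the same on/off shape described above.
// Hypothetical: FSDB collections provide their own on/off.
type EVENT_PAYLOAD<T> = { item: T; item_path: string };
type EVENT_HANDLER<T> = (payload: EVENT_PAYLOAD<T>) => void;

class SKETCH_EVENTS<T> {
  private handlers = new Map<string, Set<EVENT_HANDLER<T>>>();

  on(event: string, handler: EVENT_HANDLER<T>): void {
    const set = this.handlers.get(event) ?? new Set<EVENT_HANDLER<T>>();
    set.add(handler);
    this.handlers.set(event, set);
  }

  off(event: string, handler: EVENT_HANDLER<T>): void {
    this.handlers.get(event)?.delete(handler);
  }

  emit(event: string, payload: EVENT_PAYLOAD<T>): void {
    const set = this.handlers.get(event);
    if (set === undefined) return;
    set.forEach((handler) => handler(payload));
  }
}

type SKETCH_USER = { id: string; email: string };
const events = new SKETCH_EVENTS<SKETCH_USER>();
const created_paths: string[] = [];

// Keep a reference to the handler so the same function can be passed to off().
const on_create: EVENT_HANDLER<SKETCH_USER> = ({ item_path }) => {
  created_paths.push(item_path);
};

events.on('create', on_create);
events.emit('create', {
  item: { id: 'able-fish-door', email: 'alice@example.com' },
  item_path: './data/users/able-fish-door.json', // illustrative path only
});

// After off(), the handler no longer receives events.
events.off('create', on_create);
events.emit('create', {
  item: { id: 'calm-green-lake', email: 'bob@example.com' },
  item_path: './data/users/calm-green-lake.json', // illustrative path only
});
```

The practical point is that unsubscribing needs the same function reference that was subscribed, so an inline arrow function passed to `on` cannot later be removed with `off`.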
Indexers
Indexers let find() work without scanning every file on disk. You can mix and match indexer types in a single collection.
Symlink Indexer
Creates symlinks that organize indexed values on disk alongside the data. Good for exact lookups and human browsing.
import { FSDB_INDEXER_SYMLINKS } from '@andyburke/fsdb/indexers';
import { by_email, by_character } from '@andyburke/fsdb/organizers';
// single-field indexer
new FSDB_INDEXER_SYMLINKS({
name: 'email',
field: 'email',
organize: by_email,
});
// multi-value indexer (split one field into multiple index entries)
new FSDB_INDEXER_SYMLINKS({
name: 'keywords',
get_values_to_index: (user) => user.bio.split(/\W/).filter((word) => word.length > 3),
to_many: true,
organize: by_character,
});
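To see what the multi-value indexer above would receive, the same split-and-filter expression can be run on a sample string. `keywords_from_bio` and the sample text are illustrative only; the normalized variant is an optional idea, not something FSDB is known to require.

```typescript
// The same expression as get_values_to_index above, applied to a plain string.
function keywords_from_bio(bio: string): string[] {
  return bio.split(/\W/).filter((word) => word.length > 3);
}

// Optional variation (an assumption, not an FSDB requirement):
// lowercase and de-duplicate so "Hiking" and "hiking" share one index entry.
function normalized_keywords_from_bio(bio: string): string[] {
  const lowered = keywords_from_bio(bio).map((word) => word.toLowerCase());
  return Array.from(new Set(lowered));
}

// Short words ("and", "tea") and the empty strings left by punctuation are dropped.
const raw_keywords = keywords_from_bio('Loves hiking, photography and tea');

const normalized_keywords = normalized_keywords_from_bio('Hiking fan. Loves hiking!');
```

Each returned word becomes its own index entry when `to_many` is set, so a noisy tokenizer directly multiplies the number of symlinks created per item.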
On disk, a symlink indexer creates:
<root>/
  .indexes/<indexer_name>/           <-- indexed value hierarchy
    <tld>/                           <-- organized path segments
      <value>/                       <-- e.g., "example.com"
        <sanitized_value>.json       <-- symlink to the item file
  <item_dir>/
    <item>.json                      <-- the actual data file
    .index.symlink.<name>.<value>    <-- reverse symlink back to index
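As a rough illustration of the layout above, the following sketch builds the index path an email value might map to. `email_index_path` is a hypothetical helper; the sanitization rule and the exact segment choices are assumptions made for the example, not FSDB's actual code.

```typescript
// Hypothetical sketch of how an email value could map onto the layout above.
// The sanitization rule and segment names are assumptions, not FSDB's code.
function email_index_path(root: string, indexer_name: string, email: string): string {
  const lowered = email.toLowerCase();
  const domain = lowered.slice(lowered.lastIndexOf('@') + 1); // e.g. "example.com"
  const tld = domain.slice(domain.lastIndexOf('.') + 1);      // e.g. "com"
  // Assumed sanitization: replace anything unusual in a filename with "_".
  const sanitized = lowered.replace(/[^a-z0-9@._-]/g, '_');
  return [root, '.indexes', indexer_name, tld, domain, `${sanitized}.json`].join('/');
}

const index_path = email_index_path('./data/users', 'email', 'alice@example.com');
```

Under these assumptions, `index_path` is `./data/users/.indexes/email/com/example.com/alice@example.com.json`, which is the kind of path you could browse to by hand.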
BTree Indexer
An in-memory sorted map persisted as JSON. Enables fast lookups and prefix (substring) search on indexed values. The tree loads from disk automatically when the collection is constructed.
import { FSDB_INDEXER_BTREE } from '@andyburke/fsdb/indexers';
import { by_character } from '@andyburke/fsdb/organizers';
new FSDB_INDEXER_BTREE({
name: 'name',
field: 'name',
organize: by_character,
});
// prefix search -- finds all indexed values starting with "ali"
const results = await collection.find({ name: 'ali' });
On disk, BTree data lives at: <root>/.indexes.btree/<indexer_name>.btree.json
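Why a sorted structure makes prefix lookups cheap can be shown with a plain sorted array and a binary search. This is an illustration of the general idea only, not FSDB's BTree implementation; `find_by_prefix` is a hypothetical helper.

```typescript
// Illustration only: prefix search over sorted keys using binary search.
// A BTree keeps keys ordered in a similar way; this is not FSDB's implementation.
function find_by_prefix(sorted_keys: string[], prefix: string): string[] {
  // Lower bound: first index whose key is >= prefix.
  let low = 0;
  let high = sorted_keys.length;
  while (low < high) {
    const mid = (low + high) >>> 1;
    const key = sorted_keys[mid] as string;
    if (key < prefix) low = mid + 1;
    else high = mid;
  }
  // Keys sharing the prefix sit next to each other, so scan until one does not match.
  const matches: string[] = [];
  for (let i = low; i < sorted_keys.length; i++) {
    const key = sorted_keys[i] as string;
    if (!key.startsWith(prefix)) break;
    matches.push(key);
  }
  return matches;
}

// Default sort() and the < operator both compare by UTF-16 code units.
const names = ['alice', 'alina', 'bob', 'alistair', 'carol'].sort();
const ali_matches = find_by_prefix(names, 'ali');
```

Locating the first candidate takes O(log n) comparisons, and the scan afterwards touches only the matching keys plus one.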
Comparing Indexers
| | Symlink | BTree |
|---|---|---|
| How it works | Creates symlinks on disk | In-memory sorted map persisted as JSON |
| Lookup speed | Disk traversal per indexer | O(log n) lookup, no disk I/O at query time |
| Substring search | No | Yes |
| Best for | Fields you want to browse on disk (email, phone, names) | Fields you only search programmatically (IDs, keywords, prefixes) |
You can use both in the same collection. FSDB keeps every index in sync automatically.
Organizers
An organizer is a function that takes a string and returns path segments. FSDB ships with several built-in organizers:
`by_character` -- nested directories named for the first one, two, and three characters, then the full name: `abcdefg.json` → `/a/ab/abc/abcdefg/abcdefg.json`
`by_email` -- organized by tld, domain, then value
`by_phone` -- organized by country code, area code, etc.
`by_lurid` -- organized using a lurid id (default)
`flat` -- flat file listing with no subdirectories
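A custom organizer could follow the same pattern. The sketch below reproduces the `by_character` example path shown above; the `ORGANIZER` type and `sketch_by_character` are hypothetical, and the signature FSDB actually expects may differ.

```typescript
// Hypothetical organizer in the spirit of by_character; the real signature may differ.
type ORGANIZER = (filename: string) => string[];

const sketch_by_character: ORGANIZER = (filename) => {
  // Drop the extension so directories are named after the bare value.
  const dot = filename.lastIndexOf('.');
  const stem = dot > 0 ? filename.slice(0, dot) : filename;
  const segments: string[] = [];
  // Growing prefixes of one, two, and three characters (fewer for short names).
  for (let length = 1; length <= Math.min(3, stem.length); length++) {
    segments.push(stem.slice(0, length));
  }
  // Then a directory for the full name, and finally the file itself.
  segments.push(stem, filename);
  return segments;
};

const organized = sketch_by_character('abcdefg.json').join('/');
```

Here `organized` is `a/ab/abc/abcdefg/abcdefg.json`. Spreading items across prefix directories keeps any single directory from growing very large, which is the usual reason for this kind of scheme.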
CLI
fsdb <collection> create <json>    create a new item
fsdb <collection> get <id>         retrieve an item by id
fsdb <collection> update <json>    update an existing item
fsdb <collection> delete <json>    delete an item
Environment Variables
FSDB_ROOT -- base directory for all data (default: `./.fsdb`)
FSDB_PERF -- set to enable performance timing output
FSDB_LOG_EVENTS -- set to log the event system to stdout
FSDB_TEST_DATA_STORAGE_ROOT -- test data directory (tests only)