TypeScript 100%

Andy Burke 5b75b57b6e chore: bump version		2026-06-16 13:23:54 -07:00
.zed	refactor: add a zed settings file	2025-11-03 18:58:47 -08:00
extern/btree	feature: btree indexer	2026-06-12 18:54:20 -07:00
indexers	fix: accept `to_many` on btrees even though they all work as to-many	2026-06-16 11:57:28 -07:00
organizers	fix: try to resolve some issues with rebuilding indexes	2026-04-17 13:21:18 -07:00
tests	feature: btree indexer	2026-06-12 18:54:20 -07:00
utils	feature: btree indexer	2026-06-12 18:54:20 -07:00
.gitignore	feature: initial commit	2025-06-13 20:40:28 -07:00
cli.ts	fix: try to resolve some issues with rebuilding indexes	2026-04-17 13:21:18 -07:00
deno.json	chore: bump version	2026-06-16 13:23:54 -07:00
DEVELOPMENT.md	feature: btree indexer	2026-06-12 18:54:20 -07:00
fsdb.ts	fix: filter out btree files	2026-06-16 13:23:34 -07:00
indexers.ts	feature: btree indexer	2026-06-12 18:54:20 -07:00
README.md	docs: update README	2026-06-12 19:57:02 -07:00

README.md

F(ile)S(ystem) D(ata)B(ase)

Store your data as files on disk, indexed for fast lookup as browsable symlinks and/or persistent btrees.

Why

Could a database be made as simple as a directory on a filesystem?

JSON stored in files you can grep, git diff, back up, and read with any text editor.

Indexes you can browse directly on the disk or open in a text editor.

No complex deployment requirements with remote hosts, etc.

Not for everything, or everyone.

Quick Start

import * as fsdb from '@andyburke/fsdb';
import { FSDB_INDEXER_SYMLINKS, FSDB_INDEXER_BTREE } from '@andyburke/fsdb/indexers';
import { by_character, by_email, by_phone } from '@andyburke/fsdb/organizers';

type USER = {
	id: string;
	username: string;
	email: string;
	phone: string;
};

const users = new fsdb.FSDB_COLLECTION<USER>({
	name: 'users',
	root: './data/users',
	indexers: {
		username: new FSDB_INDEXER_SYMLINKS({
			name: 'username',
			field: 'username',
			organize: by_character,
		}),
		email: new FSDB_INDEXER_SYMLINKS({
			name: 'email',
			field: 'email',
			organize: by_email
		}),
		phone: new FSDB_INDEXER_BTREE({
			name: 'phone',
			field: 'phone',
			organize: by_phone
		})
	}
});

// create an item
const user = await users.create({ id: 'able-fish-door', username: 'alice', email: 'alice@example.com', phone: '213-555-1234' });

// look up by id
const found = await users.get('able-fish-door');

// search via symlink index -- browsable on disk
const by_email_results = await users.find({ email: 'alice@example.com' });

// search via BTree index -- supports prefix matching -- you can open the persisted json in a text editor
const by_phone_prefix = await users.find({ phone: '213', substring: true });

// delete the item (get() returns null when nothing is found, so check first)
if (found) await users.delete(found);

API

Each collection is an FSDB_COLLECTION<T> that manages items of type T on disk.

Core Methods

`collection.create(item)` -- stores an item and returns it
`collection.get(id)` -- retrieves an item by its id, or null if not found
`collection.update(item)` -- updates an existing item (id must be present)
`collection.delete(item)` -- removes an item from disk and cleans up indexes
`collection.all([options])` -- iterates over every item in the collection

Search and Event Methods

`collection.find(criteria, [options])` -- finds items matching indexed fields
`collection.on(event, handler)` -- subscribes to lifecycle events
`collection.off(event, handler)` -- unsubscribes from events

The find method takes an object whose keys must correspond to indexer names:

// exact match on the email indexer
const by_email = await users.find({ email: 'alice@example.com' });

// multiple criteria -- must match all provided indexers
const by_email_and_phone = await users.find({ email: 'alice@example.com', phone: '213-555-1234' });

// with pagination
const page = await users.find({ email: 'alice@example.com' }, { limit: 10, offset: 20 });
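
To make the "must match all provided indexers" rule and the `limit`/`offset` options concrete, here is a minimal in-memory sketch. It is not FSDB's implementation: `INDEX` and `find_ids` are hypothetical names, and each index is modeled as a plain map from an indexed value to the ids that carry it.

```typescript
// Illustrative model only; INDEX and find_ids are hypothetical, not part of FSDB.
type INDEX = Map<string, Set<string>>;

function find_ids(
	indexes: Record<string, INDEX>,
	criteria: Record<string, string>,
	options: { limit?: number; offset?: number } = {},
): string[] {
	let matches: Set<string> | undefined;

	for (const [indexer_name, value] of Object.entries(criteria)) {
		const index = indexes[indexer_name];
		if (!index) throw new Error(`no indexer named "${indexer_name}"`);

		const ids = index.get(value) ?? new Set<string>();
		// Keep only the ids that every criterion so far agrees on.
		matches = matches === undefined
			? new Set(ids)
			: new Set(Array.from(matches).filter((id) => ids.has(id)));
	}

	// Sort for a stable order, then apply offset and limit.
	const ordered: string[] = matches ? Array.from(matches).sort() : [];
	const offset = options.offset ?? 0;
	const limit = options.limit ?? ordered.length;
	return ordered.slice(offset, offset + limit);
}

const indexes: Record<string, INDEX> = {
	email: new Map<string, Set<string>>([
		['alice@example.com', new Set(['able-fish-door'])],
	]),
	phone: new Map<string, Set<string>>([
		['213-555-1234', new Set(['able-fish-door', 'calm-gray-kite'])],
	]),
};

// Only 'able-fish-door' appears under both the email and the phone value.
console.log(find_ids(indexes, { email: 'alice@example.com', phone: '213-555-1234' }));
```

The sketch sorts ids before slicing so that pages are stable between calls; whether FSDB guarantees any particular ordering is not stated here.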

Events

Events fire for every operation with { item, item_path } in the payload:

create, update, get, delete, write, index, all, find
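
As a rough picture of how `on`/`off` and the `{ item, item_path }` payload fit together, the following self-contained sketch models a tiny emitter. `SKETCH_EVENTS`, `emit`, and the example paths are hypothetical; FSDB's real handler signature may differ.

```typescript
// Hypothetical stand-in for the collection's event API, for illustration only.
// The payload shape { item, item_path } is taken from the text above.
type EVENT_NAME = 'create' | 'update' | 'get' | 'delete' | 'write' | 'index' | 'all' | 'find';
type PAYLOAD<T> = { item: T; item_path: string };
type HANDLER<T> = (payload: PAYLOAD<T>) => void;

class SKETCH_EVENTS<T> {
	private handlers = new Map<EVENT_NAME, Set<HANDLER<T>>>();

	on(event: EVENT_NAME, handler: HANDLER<T>): void {
		const registered = this.handlers.get(event) ?? new Set<HANDLER<T>>();
		registered.add(handler);
		this.handlers.set(event, registered);
	}

	off(event: EVENT_NAME, handler: HANDLER<T>): void {
		this.handlers.get(event)?.delete(handler);
	}

	emit(event: EVENT_NAME, payload: PAYLOAD<T>): void {
		this.handlers.get(event)?.forEach((handler) => handler(payload));
	}
}

const events = new SKETCH_EVENTS<{ id: string }>();
const created_paths: string[] = [];
const on_create: HANDLER<{ id: string }> = ({ item_path }) => {
	created_paths.push(item_path);
};

events.on('create', on_create);
events.emit('create', { item: { id: 'able-fish-door' }, item_path: './data/users/able-fish-door.json' });

// After off(), the same handler no longer receives events.
events.off('create', on_create);
events.emit('create', { item: { id: 'calm-gray-kite' }, item_path: './data/users/calm-gray-kite.json' });
```

Note that `off` needs the same function reference that was passed to `on`, which is why the handler is stored in a variable.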

Indexers

Indexers let find() work without scanning every file on disk. You can mix and match indexer types in a single collection.

Symlink Indexer

Creates symlinks that organize indexed values on disk alongside the data. Good for exact lookups and human browsing.

import { FSDB_INDEXER_SYMLINKS } from '@andyburke/fsdb/indexers';
import { by_email, by_character } from '@andyburke/fsdb/organizers';

// single-field indexer
new FSDB_INDEXER_SYMLINKS({
	name: 'email',
	field: 'email',
	organize: by_email,
});

// multi-value indexer (split one field into multiple index entries)
new FSDB_INDEXER_SYMLINKS({
	name: 'keywords',
	get_values_to_index: (user) => user.bio.split(/\W/).filter((word) => word.length > 3),
	to_many: true,
	organize: by_character,
});

On disk, a symlink indexer creates:

<root>/
  .indexes/<indexer_name>/          <-- indexed value hierarchy
    <tld>/                            <-- organized path segments
      <value>/                        <-- e.g., "example.com"
        <sanitized_value>.json        <-- symlink to the item file
  <item_dir>/
    <item>.json                       <-- the actual data file
    .index.symlink.<name>.<value>     <-- reverse symlink back to index
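
The layout above can be sketched as a path-building function for an email index. This is an assumption-laden illustration: `sketch_email_index_path` is a hypothetical helper, and the lowercasing and sanitization rules shown are guesses, not FSDB's actual behavior.

```typescript
// Hypothetical helper for illustration; FSDB's real by_email organizer and
// sanitization rules may differ.
function sketch_email_index_path(root: string, indexer_name: string, email: string): string {
	const value = email.trim().toLowerCase();
	const domain = value.slice(value.lastIndexOf('@') + 1); // e.g. "example.com"
	const tld = domain.slice(domain.lastIndexOf('.') + 1); // e.g. "com"

	// Assumed sanitization: replace characters that are awkward in filenames.
	const sanitized = value.replace(/[^a-z0-9@._+-]/g, '_');

	return [root, '.indexes', indexer_name, tld, domain, `${sanitized}.json`].join('/');
}

console.log(sketch_email_index_path('./data/users', 'email', 'alice@example.com'));
// ./data/users/.indexes/email/com/example.com/alice@example.com.json
```

In the real index, the file at that path would be a symlink pointing at the item's JSON file, so following it (or browsing the directory) leads straight to the data.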

BTree Indexer

An in-memory sorted map persisted as JSON. Enables fast lookups and prefix search on indexed values (requested with `substring: true`, as in the Quick Start). The tree loads from disk automatically when the collection is constructed.

import { FSDB_INDEXER_BTREE } from '@andyburke/fsdb/indexers';
import { by_character } from '@andyburke/fsdb/organizers';

new FSDB_INDEXER_BTREE({
	name: 'name',
	field: 'name',
	organize: by_character,
});

// prefix search -- finds all indexed values starting with "ali"
const results = await collection.find({ name: 'ali', substring: true });

On disk, BTree data lives at: <root>/.indexes.btree/<indexer_name>.btree.json
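
Why a sorted structure makes prefix search cheap: all keys sharing a prefix sit next to each other, so one binary search finds the start and a short scan collects the rest. The sketch below uses a sorted array in place of a tree; `lower_bound` and `prefix_search` are hypothetical names, not FSDB functions.

```typescript
// Illustrative only: a sorted array stands in for the tree's ordered keys.
function lower_bound(sorted_keys: string[], target: string): number {
	let low = 0;
	let high = sorted_keys.length;
	while (low < high) {
		const middle = (low + high) >>> 1;
		if (sorted_keys[middle] < target) low = middle + 1;
		else high = middle;
	}
	return low; // index of the first key >= target
}

function prefix_search(sorted_keys: string[], prefix: string): string[] {
	const results: string[] = [];
	// Binary search finds the start in O(log n); matching keys are contiguous after it.
	for (let i = lower_bound(sorted_keys, prefix); i < sorted_keys.length; i++) {
		if (!sorted_keys[i].startsWith(prefix)) break;
		results.push(sorted_keys[i]);
	}
	return results;
}

const names = ['alice', 'alina', 'bob', 'alistair', 'albert'].sort();
console.log(prefix_search(names, 'ali').join(', '));
// alice, alina, alistair
```

The same contiguity argument does not hold for matches in the middle of a value, so this sketch only covers the prefix case described above.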

Comparing Indexers

| | Symlink | BTree |
| --- | --- | --- |
| How it works | Creates symlinks on disk | In-memory sorted map persisted as JSON |
| Lookup speed | Disk traversal per indexer | O(log n) lookup, no disk I/O at query time |
| Substring search | No | Yes |
| Best for | Fields you want to browse on disk (email, phone, names) | Fields you only search programmatically (IDs, keywords, prefixes) |

You can use both in the same collection. FSDB will keep them all in sync automatically.

Organizers

An organizer is a function that takes a string and returns path segments. FSDB ships with several built-in organizers:

`by_character` -- nested directories for the first one, two, and three characters, then the full name: `abcdefg.json` → `/a/ab/abc/abcdefg/abcdefg.json`
`by_email` -- organized by tld, domain, then value
`by_phone` -- organized by country code, area code, etc.
`by_lurid` -- organized using a lurid id (default)
`flat` -- flat file listing with no subdirectories
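
To show what "takes a string and returns path segments" can look like, here is a small re-creation of the `by_character` example above. `sketch_by_character` is a hypothetical function written for this README, not FSDB's source, and the exact organizer signature FSDB expects is an assumption.

```typescript
// Hypothetical re-implementation for illustration only; not FSDB's actual source.
function sketch_by_character(filename: string): string[] {
	// Strip the final extension to get the bare name ("abcdefg.json" -> "abcdefg").
	const dot = filename.lastIndexOf('.');
	const name = dot > 0 ? filename.slice(0, dot) : filename;

	// Cumulative prefixes of length 1..3 (fewer if the name is short).
	const segments: string[] = [];
	for (let length = 1; length <= Math.min(3, name.length); length++) {
		segments.push(name.slice(0, length));
	}

	// Then the full name as a directory, then the file itself.
	segments.push(name, filename);
	return segments;
}

console.log(sketch_by_character('abcdefg.json').join('/'));
// a/ab/abc/abcdefg/abcdefg.json
```

Spreading items across nested prefix directories keeps any single directory from growing too large, which is the usual reason for this kind of layout.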

CLI

fsdb <collection> create <json>      create a new item
fsdb <collection> get <id>           retrieve an item by id
fsdb <collection> update <json>      update an existing item
fsdb <collection> delete <json>      delete an item

Environment Variables

FSDB_ROOT -- base directory for all data (default: `./.fsdb`)
FSDB_PERF -- set to enable performance timing output
FSDB_LOG_EVENTS -- set to log the event system to stdout
FSDB_TEST_DATA_STORAGE_ROOT -- test data directory (tests only)