Add option to generate summaries of each note and index them. #43

justyns · 2024-07-13T22:52:12Z

Extension to #34

I'm not sure about this yet.

The general idea is to generate a short one-paragraph summary of each note, generate embeddings for it, and index it. This is in addition to the recently added embeddings index that is per paragraph.
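As a rough illustration of that flow (summarize, embed the summary, hand back a record to index), here is a minimal sketch. The names `indexNoteSummary`, `Summarize`, `Embed`, and `SummaryRecord` are hypothetical and not taken from silverbullet-ai; the model and embedding calls are passed in as functions so the sketch makes no assumption about any provider's API.

```typescript
// Hypothetical types for illustration; not the plug's real interfaces.
type Summarize = (text: string) => Promise<string>;
type Embed = (text: string) => Promise<number[]>;

interface SummaryRecord {
  page: string;
  summary: string;
  embedding: number[];
}

// Summarize one note, embed the summary (not the full note),
// and return a record that an index could store alongside the
// existing per-paragraph embeddings.
async function indexNoteSummary(
  page: string,
  noteText: string,
  summarize: Summarize,
  embed: Embed,
): Promise<SummaryRecord> {
  const summary = (await summarize(noteText)).trim();
  const embedding = await embed(summary);
  return { page, summary, embedding };
}
```

Passing the two functions in keeps the sketch independent of whether the summary comes from a local model or a hosted one.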

So far I'm testing out the phi-3 model with ollama and some summaries are okay, but a lot include random hallucinations.

I'm probably going to add this to the AI: Search page and merge it as an experimental feature for now, but the prompt will definitely need some tweaking, and I will need to test other local models.

edit:

I'm merging this in with the setting off by default (like the normal embeddings). I had much better luck using gemma2 as the model to generate summaries. It does make indexing take quite a bit longer, though, when using a locally hosted model.

I also added some in-memory caching so that repeatedly editing a page doesn't cause the same thing to be regenerated over and over.
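A minimal sketch of what such a cache could look like, assuming it wraps the embedding call and is keyed by the exact input text. `withEmbeddingCache` and the `maxEntries` limit are hypothetical names for illustration, not the code that was merged.

```typescript
// Hypothetical embedding function type; not the plug's real interface.
type Embed = (text: string) => Promise<number[]>;

// Wrap an embedding function with an in-memory cache keyed by the
// exact input text, so unchanged text is not sent to the model again.
function withEmbeddingCache(embed: Embed, maxEntries = 1000): Embed {
  const cache = new Map<string, Promise<number[]>>();
  return (text: string) => {
    const hit = cache.get(text);
    if (hit !== undefined) return hit;

    const pending = embed(text);
    // Forget failed requests so a later call can retry.
    pending.catch(() => cache.delete(text));

    if (cache.size >= maxEntries) {
      // A Map iterates in insertion order, so this evicts the oldest entry.
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(text, pending);
    return pending;
  };
}
```

Caching the promise rather than the resolved value means two edits that arrive while a request is still in flight share one call. The cache lives only in memory, so it is empty again after a reload.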


zefhemel · 2024-07-14T10:19:44Z

I'd be interested in your experience using various local models. I haven't had time for this myself yet, but I had some hopes for phi-3 because it seemed quite small yet well performing. Perhaps at some point you can document this on the plug's website as well? Would be good to give people some guidance if you have it.

justyns · 2024-07-15T01:00:25Z

For sure! I don't have a ton of experience with local models either, but I'm hoping others will eventually chime in too.

I was hoping to use phi-3 too, but it would generate weird stuff sometimes.

As an example, here's a note I was testing with:

Lunar is a cat. He’s only 2 years old, and enjoys walking with his leash.
Lunar is a fluffy kitty, and we love him very much.
He recently got a new collar that he is really proud of. It has fish on it.

gemma2 generates:

Lunar is a two-year-old, fluffy cat who enjoys walking on a leash and has a new collar with fish on it.

whereas phi-3 returns this:

Lunar is a well-loved, fluffy two-year-donkey kitty who enjoys leash walking adventures with his human companions. Known for their bond and shared activities like strolling outside safely restrained by a harness rather than traditional pet collars, these feline friends bring joy to those they touch in the small community where Lunar resides.

Note the "two-year-donkey" and the extra descriptions that are not in the original note. I had some other test notes where it added random things like that too. The summaries are mostly passable, and I suspect they could be better with better prompting and some examples, but gemma2 was nicer out of the box.
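One way "prompting with some examples" might look is a few-shot prompt that puts worked note/summary pairs ahead of the real note. This is only a sketch: `buildSummaryPrompt`, `SummaryExample`, and the instruction wording are assumptions for illustration, not the prompt the plug uses, and whether it helps a given model would need testing.

```typescript
// Hypothetical shape for a worked example shown to the model.
interface SummaryExample {
  note: string;
  summary: string;
}

// Build a few-shot prompt: instructions, then example pairs,
// then the note to summarize with an open "Summary:" line.
function buildSummaryPrompt(note: string, examples: SummaryExample[]): string {
  const parts: string[] = [
    "Summarize the note in one short paragraph.",
    "Use only facts stated in the note. Do not add new details.",
    "",
  ];
  for (const ex of examples) {
    parts.push(`Note:\n${ex.note}`, `Summary: ${ex.summary}`, "");
  }
  parts.push(`Note:\n${note}`, "Summary:");
  return parts.join("\n");
}
```

The resulting string would be sent to the model as the prompt, and the completion read as the summary.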

justyns · 2024-07-15T01:01:57Z

I haven't figured out the best way to do it yet, but I want to set up some sort of benchmark for local models related to silverbullet-ai, or at least related to notes in general.

Nothing too complex, but something easy enough that I can plug a bunch of models in and then compare the results of different commands against each other.
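A minimal sketch of such a harness, assuming each model is exposed as a function from note text to summary. All names here (`runBenchmark`, `novelWords`, `BenchmarkRow`) are hypothetical. The `novelWords` check is a deliberately crude heuristic: it lists words in the summary that never appear in the note, which would surface an invented word like "donkey" but would also flag harmless paraphrases, so it is a review aid rather than a score.

```typescript
// Hypothetical summarizer type: one function per model under test.
type Summarize = (note: string) => Promise<string>;

interface BenchmarkRow {
  model: string;
  note: string;
  summary: string;
  novelWords: string[];
}

// Lowercase alphanumeric tokens; punctuation is dropped.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Words used in the summary that do not occur anywhere in the note.
function novelWords(note: string, summary: string): string[] {
  const known = new Set(tokenize(note));
  const unique = Array.from(new Set(tokenize(summary)));
  return unique.filter((word) => !known.has(word));
}

// Run every model over every note and collect the results in one table.
async function runBenchmark(
  models: Record<string, Summarize>,
  notes: string[],
): Promise<BenchmarkRow[]> {
  const rows: BenchmarkRow[] = [];
  for (const [model, summarize] of Object.entries(models)) {
    for (const note of notes) {
      const summary = await summarize(note);
      rows.push({ model, note, summary, novelWords: novelWords(note, summary) });
    }
  }
  return rows;
}
```

Running the models one after another keeps a locally hosted model from being asked to handle several requests at once.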

justyns added 6 commits July 13, 2024 17:49

Add option to generate summaries of each note and index them.

b67c5c9

Search summary embeddings and show on ai search page

c43d667

Add an in-memory cache for embeddings to avoid re-generating the same embeddings

bcfe89e

Move caching to the embedding provider so it is used everywhere

314ffeb

Add note summary generation/search to docs

18e74d0

Add _generateEmbeddings to gemini

ab486ba

justyns marked this pull request as ready for review July 14, 2024 09:07

justyns merged commit 336632e into main Jul 14, 2024
3 checks passed

justyns deleted the note-excerpts branch July 14, 2024 09:09
