This repository has been archived by the owner on Feb 9, 2024. It is now read-only.

Rob's Xmas present to himself: Hacking on the content resolver #13

Merged
merged 42 commits into from
Jan 10, 2024
42 commits
df92101
Artist search feature implemented
mayhem Dec 24, 2023
e48b4e4
Artist lb radio works!
mayhem Dec 24, 2023
ce81ee5
Refactor class to select recordings so it can be used by more than on…
mayhem Dec 25, 2023
0081aed
non-local playlists now resolve to local files.
mayhem Dec 30, 2023
7922977
Fix location issue and store full pathname in the DB
mayhem Dec 30, 2023
7805d02
Add progress bar to scan collection
mayhem Dec 30, 2023
e425bae
writing and uploading resolved playlists now works!
mayhem Dec 30, 2023
f977dda
artist, tag and stats elements now play cleanly together!
mayhem Dec 30, 2023
349f806
Add duplicate funcion to show duplicates in the collection
mayhem Dec 30, 2023
b07533e
Add missing file
mayhem Dec 30, 2023
8084a8b
Improve the cleanup function
mayhem Dec 30, 2023
e4d5c17
Fix delete
mayhem Dec 30, 2023
1d86d5f
Improve dups
mayhem Dec 31, 2023
9c74923
Finished the duplicate recording detetction feature
mayhem Dec 31, 2023
ea8505e
Improve the status update of the subsonic scan and make it faster
mayhem Dec 31, 2023
9e0325b
Minor cleanup
mayhem Dec 31, 2023
907cc99
subsonic: use getAlbumList2
phw Dec 31, 2023
5521f49
subsonic: avoid call to getAlbumInfo2 if MBID is already present
phw Dec 31, 2023
9f38115
Merge pull request #10 from phw/add-artist-element
mayhem Dec 31, 2023
8f419aa
subsonic: fix wrong variable use to read album name and artist
phw Dec 31, 2023
92f79b9
Merge pull request #11 from phw/fix-album-info
mayhem Dec 31, 2023
042956b
Make the metadata lookup suck less with proper progress bars
mayhem Dec 31, 2023
1c594ac
Show top tags after metadata load
mayhem Dec 31, 2023
c8e8241
Do not resolve playlists if no tracks are missing. Less crashy.
mayhem Dec 31, 2023
a59fa01
Update readme for the new features on this branch
mayhem Dec 31, 2023
84ab20f
Document new features
mayhem Dec 31, 2023
aad73e3
Finish updating the README
mayhem Dec 31, 2023
f91626a
First cut at periodic jams for lb local. Not a bad start!
mayhem Jan 4, 2024
0e272c0
Add the recent listens filter, which is really critical
mayhem Jan 5, 2024
8b80e21
Start tracking recordings that went unresolved
mayhem Jan 6, 2024
0aacd34
Very simple unresolved recordings report is in place
mayhem Jan 6, 2024
22fe98c
Unresolved albums report is now done
mayhem Jan 6, 2024
877d800
Filter recent listens too
mayhem Jan 7, 2024
44844b1
Improve the unresolved recordings function
mayhem Jan 8, 2024
3b29b22
db open cleanup
mayhem Jan 8, 2024
19ae013
Huh. I'm stuck
mayhem Jan 9, 2024
b16af12
Interim checkin
mayhem Jan 9, 2024
74f7a34
All features now work with filename or subsonic_id
mayhem Jan 9, 2024
f866b0d
All functions now work without config.py if you dont use subsonic
mayhem Jan 9, 2024
9b31507
Rework the index_dir and use db_files instead.
mayhem Jan 10, 2024
f2dbf1a
Make match threshold a command line arg
mayhem Jan 10, 2024
3552ab7
Merge branch 'main' into add-artist-element
mayhem Jan 10, 2024
3 changes: 3 additions & 0 deletions .gitignore
Expand Up @@ -15,3 +15,6 @@ mp3
/build/
/dist/
config.py
*.jspf
*.m3u
.eggs
177 changes: 153 additions & 24 deletions README.md
Expand Up @@ -3,10 +3,34 @@
The ListenBrainz Content Resolver resolves global JSPF playlists to
a local collection of music, using the resolve function.

ListenBrainz Local Radio allows you to generate tag radio playlists that
can be uploaded to your favorite subsonic API enabled music system.
The features of this package include:

## Quick Start
1. ListenBrainz Radio Local: generates radio-style playlists that are created
using only the files in the local collection. If that is not possible, a global
playlist with MBIDs is resolved to the local file collection as well as possible.

2. Periodic-jams: ListenBrainz periodic-jams, but fully resolved against your own
local collection. This is optimized for local collections and gives better results
than the global troi patch of the same name.

3. Metadata fetching: Several of the features here require metadata to be downloaded
from ListenBrainz in order to power the LB Radio Local.

4. Scan local file collections. MP3, Ogg Vorbis, Ogg Opus, WMA, M4A and FLAC files are supported.

5. Scan a remote subsonic API collection. We've tested Navidrome, Funkwhale and Gonic.

6. Print a report of duplicate files in the collection

7. Print a list of top tags for the collection

8. Print a list of tracks that failed to resolve and print the list of albums that they
belong to. This gives the user feedback about tracks that could be added to the collection
to improve the local matching.


## Installation

To install the package:

Expand All @@ -16,15 +40,58 @@ source .virtualenv/bin/activate
pip install -r requirements.txt
```

### Setting up config.py

While it isn't strictly necessary to set up config.py, it makes using the resolver easier:

```
cp config.py.sample config.py
```

Then edit config.py and set DATABASE_FILE to the location where you want to store your resolver
database file. If you plan to use a Subsonic API, then fill out the Subsonic section as well.

If you decide not to use the config.py file, make sure to pass the path to the DB file with -d to each
command. All further examples in this file assume you added the config file and will therefore omit
the -d option.
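
The precedence described above (an explicit -d value, otherwise DATABASE_FILE from config.py) could be sketched roughly as follows. This is only an illustration: `resolve_db_path` and its `config_module` parameter are hypothetical names and not part of the resolver.

```python
import importlib


def resolve_db_path(cli_value=None, config_module="config"):
    """Return the DB path: an explicit -d value wins, else config.DATABASE_FILE."""
    if cli_value:
        return cli_value
    try:
        # config.py is optional, so a missing module is not an error by itself.
        config = importlib.import_module(config_module)
    except ImportError:
        config = None
    db_file = getattr(config, "DATABASE_FILE", "")
    if not db_file:
        raise SystemExit("No database file: pass -d or set DATABASE_FILE in config.py")
    return db_file
```

Calling `resolve_db_path("music.db")` returns the given path without looking at config.py at all.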

## Scanning your collection

Note: Soon we will eliminate the requirement to do a filesystem scan before also doing a subsonic
scan (if you plan to use subsonic). For now, do the file system scan, then the subsonic scan.

### Scan a collection on the local filesystem

Then prepare the index and scan a music collection. mp3, m4a, wma, OggVorbis, OggOpus and flac files are supported.

```
./resolve.py create music_index
./resolve.py scan music_index <path to mp3/flac files>
./resolve.py create
./resolve.py scan <path to mp3/flac files>
```

If you remove tracks from your collection, use cleanup to remove references to those tracks:

```
./resolve.py cleanup
```

### Scan a Subsonic collection

To enable Subsonic support, you need to create a config.py file from config.py.sample:

```
cp config.py.sample config.py
```

Then edit the file and add your subsonic configuration.

```
./resolve.py subsonic
```

This will match your collection to the remote subsonic API collection.


## Resolve JSPF playlists to local collection

Then make a JSPF playlist on LB:
Expand All @@ -42,7 +109,7 @@ curl "https://api.listenbrainz.org/1/playlist/<playlist MBID>" > test.jspf
Finally, resolve the playlist to local files:

```
./resolve.py playlist music_index input.jspf output.m3u
./resolve.py playlist input.jspf output.m3u
```

Then open the m3u playlist with a local tool.
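
To give a rough idea of what the resolver has to read from the input file, here is a minimal sketch that pulls recording MBIDs out of a JSPF playlist. It assumes each track's `identifier` is a MusicBrainz recording URL (either a single string or a list of strings); `recording_mbids_from_jspf` is a hypothetical helper, not a function from this package.

```python
import json

RECORDING_PREFIX = "https://musicbrainz.org/recording/"


def recording_mbids_from_jspf(jspf_text):
    """Return the recording MBIDs referenced by a JSPF playlist, in order."""
    playlist = json.loads(jspf_text)["playlist"]
    mbids = []
    for track in playlist.get("track", []):
        identifiers = track.get("identifier", [])
        # Assumption: the identifier may be a single string or a list of strings.
        if isinstance(identifiers, str):
            identifiers = [identifiers]
        for identifier in identifiers:
            if identifier.startswith(RECORDING_PREFIX):
                mbids.append(identifier[len(RECORDING_PREFIX):])
                break
    return mbids
```

Tracks without a recording identifier are simply skipped in this sketch.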
Expand All @@ -52,9 +119,17 @@ Then open the m3u playlist with a local tool.
### Prerequisites

NOTE: This feature only works if your music collection
is tagged with MusicBrainz tags. (We recommend Picard:
http://picard.musicbrainz.org ) and if your music
collection is also available via a Subsonic API.
is tagged with MusicBrainz tags. We recommend Picard:
http://picard.musicbrainz.org for tagging your collection.

If you're unwilling to properly tag your collection,
then please do not contact us to request that we remove
this requirement. We can't. We won't. Please close this
tab and move on.

If you have your collection hosted on an app like Funkwhale,
Navidrome or Gonic, which have a Subsonic API, you can generate
playlists directly in the web application.

### Setup

Expand All @@ -64,38 +139,53 @@ to download more data for your MusicBrainz tagged music collection.
First, download tag and popularity data:

```
./resolve.py metadata music_index
./resolve.py metadata
```

Then, copy config.py.sample to config.py and then edit config.py:
### Playlist generation

Currently artist and tag elements are supported for LB Local Radio,
which means that playlists from these two elements are made from the local
collection and thus will not need to be resolved. All other elements
may generate playlists with tracks that are not available in your
collection. In this case, the fuzzy search will attempt to match the
missing tracks to your collection.
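
As an illustration of fuzzy matching with a match threshold, the sketch below uses Python's standard `difflib` as a stand-in. This is not necessarily the technique the resolver uses internally; `best_local_match`, the track dictionaries and the default threshold of 0.8 are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def best_local_match(wanted, local_tracks, threshold=0.8):
    """Return the local track most similar to `wanted`, or None below threshold."""

    def key(track):
        # Compare on a lower-cased "artist title" string.
        return f"{track['artist']} {track['title']}".lower()

    best, best_score = None, 0.0
    for track in local_tracks:
        score = SequenceMatcher(None, key(wanted), key(track)).ratio()
        if score > best_score:
            best, best_score = track, score
    return best if best_score >= threshold else None
```

Raising the threshold gives fewer but more reliable matches; lowering it resolves more tracks at the risk of wrong ones.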

For a complete reference to LB Radio, see:
[ListenBrainz Radio Docs](https://troi.readthedocs.io/en/latest/lb_radio.html)

The playlist generator works with a given mode: "easy", "medium"
and "hard". An easy playlist will generate data that more closely
meets the prompt, which should translate into a playlist that is
easier and more pleasant to listen to. Medium goes further and includes
less popular and more far-flung stuff, before hard digs at the bottom
of the barrel.

This may not always feel very pronounced, especially if your collection
isn't very suited for the prompt that was given.
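
One way to picture the modes is as popularity windows over the candidate recordings. The sketch below is only a guess at the idea: the window boundaries in `MODE_WINDOWS` are invented for illustration, ranking is done by position rather than by raw popularity value, and `pick_for_mode` is a hypothetical name.

```python
import random

# Hypothetical popularity windows (in percent of the ranked list) per mode.
MODE_WINDOWS = {"easy": (66, 100), "medium": (33, 66), "hard": (0, 33)}


def pick_for_mode(recordings, mode, num_recordings):
    """Pick recordings whose popularity rank falls inside the mode's window."""
    begin_percent, end_percent = MODE_WINDOWS[mode]
    ranked = sorted(recordings, key=lambda rec: rec["popularity"])
    lo = len(ranked) * begin_percent // 100
    hi = len(ranked) * end_percent // 100
    window = ranked[lo:hi]
    if len(window) < num_recordings:
        # Too few candidates in the window: ignore it and use everything.
        window = ranked
    return random.sample(window, min(num_recordings, len(window)))
```

With a small collection the window is ignored, which is one reason the difference between modes may not feel pronounced.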

```
cp config.py.sample config.py
edit config.py
```

Fill out the values for your subsonic server API and save the file.
Finally, match your collection against the subsonic collection:
#### Artist Element

```
./resolve.py subsonic music_index
./resolve.py lb-radio easy 'artist:(taylor swift, drake)'
```

### Playlist generation
Generates a playlist with music from Taylor Swift and artists similar
to her, plus Drake and artists similar to him.

Currently only tag elements are supported for LB Local Radio.

To generate a playlist:
#### Tag Element

```
./resolve.py lb-radio music_index easy 'tag:(downtempo, trip hop)'
./resolve.py lb-radio easy 'tag:(downtempo, trip hop)'
```

This will generate a playlist on easy mode for recordings that are
tagged with "downtempo" AND "trip hop".

```
./resolve.py lb-radio music_index medium 'tag:(downtempo, trip hop)::or'
./resolve.py lb-radio medium 'tag:(downtempo, trip hop)::or'
```

This will generate a playlist on medium mode for recordings that are
Expand All @@ -105,5 +195,44 @@ at the end of the prompt.
You can include more than one tag query in a prompt:

```
./resolve.py lb-radio music_index medium 'tag:(downtempo, trip hop)::or tag:(punk, ska)'
./resolve.py lb-radio medium 'tag:(downtempo, trip hop)::or tag:(punk, ska)'
```
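
The difference between the default AND behaviour and the `::or` option can be sketched with plain set operations. `matches_tags` is a hypothetical helper written for this illustration; the actual matching in the resolver happens against its database.

```python
def matches_tags(recording_tags, wanted_tags, operator="and"):
    """Return True if a recording's tags satisfy the tag query."""
    wanted = {tag.lower() for tag in wanted_tags}
    have = {tag.lower() for tag in recording_tags}
    if operator == "or":
        # ::or, at least one of the wanted tags must be present.
        return bool(wanted & have)
    # Default: every wanted tag must be present.
    return wanted <= have
```

For example, a recording tagged only "downtempo" fails `tag:(downtempo, trip hop)` but passes `tag:(downtempo, trip hop)::or`.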

#### Stats, Collections, Playlists and Rec

There are more elements, but these are "global" elements that will need to
have their results resolved to the local collection. The resolution process is
always a bit tricky since its outcome heavily depends on the collection. The
generator will do its best to generate a fitting playlist, but that doesn't
always happen.

For the other elements, please refer to the
[ListenBrainz Radio Docs](https://troi.readthedocs.io/en/latest/lb_radio.html)

## Other features

### Collection deduplication

The "duplicates" command will print a report of duplicate recordings
in your collection, based on MusicBrainz Recording MBIDs. There are several
types of duplicates that this may find:

1. Duplicated tracks with the same title, release and artist.
2. Duplicated tracks that live on different releases, but have the same name
3. Duplicated tracks that exist once on an album and again on a compilation.

If you specify -e or --exclude-different-release, then case #3 will not be shown.
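
A minimal sketch of MBID-based duplicate detection is shown below. It is a simplification: the `recording_mbid` and `release_mbid` keys are assumed field names, and the `exclude_different_release` flag here drops every group that spans more than one release, which is coarser than the case #3 filter described above.

```python
from collections import defaultdict


def find_duplicates(files, exclude_different_release=False):
    """Group files by recording MBID and keep groups with more than one file."""
    groups = defaultdict(list)
    for entry in files:
        groups[entry["recording_mbid"]].append(entry)

    duplicates = {}
    for mbid, group in groups.items():
        if len(group) < 2:
            continue
        releases = {entry["release_mbid"] for entry in group}
        # Simplification: skip any group whose files live on different releases.
        if exclude_different_release and len(releases) > 1:
            continue
        duplicates[mbid] = group
    return duplicates
```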

### Top tags

The top-tags command will print the top tags and the number of times they
have been used in your collection. This requires that the "metadata"
command was run before.
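
Conceptually, the report is a frequency count over the tags attached to your recordings. A small sketch, assuming each recording is a dictionary with a `tags` list (a made-up shape for this example):

```python
from collections import Counter


def top_tags(recordings, limit=10):
    """Return (tag, count) pairs for the most frequently used tags."""
    counts = Counter(tag for rec in recordings for tag in rec.get("tags", []))
    return counts.most_common(limit)
```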

### Unresolved Releases

Any tracks that fail to resolve to a local collection will have their
recording_mbid saved in the database. This enables the unresolved releases
report, which lists releases that you might consider adding to your
collection because in the past they failed to resolve to your local collection.
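
The bookkeeping behind such a report could look roughly like this. The table name `unresolved`, its columns and both helper functions are assumptions for illustration only; the resolver's actual schema may differ.

```python
import sqlite3


def open_unresolved_db(path=":memory:"):
    """Open a database with a hypothetical table of unresolved recordings."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS unresolved ("
        "recording_mbid TEXT PRIMARY KEY, lookup_count INTEGER NOT NULL)"
    )
    return conn


def record_unresolved(conn, recording_mbid):
    """Remember that a recording failed to resolve, counting repeat misses."""
    cur = conn.execute(
        "UPDATE unresolved SET lookup_count = lookup_count + 1 WHERE recording_mbid = ?",
        (recording_mbid,),
    )
    if cur.rowcount == 0:
        # First miss for this recording: create the row.
        conn.execute(
            "INSERT INTO unresolved (recording_mbid, lookup_count) VALUES (?, 1)",
            (recording_mbid,),
        )


def most_missed(conn, limit=10):
    """Return (recording_mbid, lookup_count) rows, most frequently missed first."""
    return conn.execute(
        "SELECT recording_mbid, lookup_count FROM unresolved "
        "ORDER BY lookup_count DESC, recording_mbid LIMIT ?",
        (limit,),
    ).fetchall()
```

Mapping the stored recording MBIDs to releases would then need a metadata lookup, which this sketch leaves out.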

3 changes: 3 additions & 0 deletions config.py.sample
@@ -1,3 +1,6 @@
# Where to find the database file
DATABASE_FILE = ""

# To connect to a subsonic API
SUBSONIC_HOST = "" # include http:// or https://
SUBSONIC_USER = ""
Expand Down
64 changes: 64 additions & 0 deletions lb_content_resolver/artist_search.py
@@ -0,0 +1,64 @@
import os
from collections import defaultdict
import datetime
import sys

import peewee
import requests

from lb_content_resolver.model.database import db
from lb_content_resolver.model.recording import Recording, RecordingMetadata
from lb_content_resolver.utils import select_recordings_on_popularity
from troi.recording_search_service import RecordingSearchByArtistService
from troi.splitter import plist


class LocalRecordingSearchByArtistService(RecordingSearchByArtistService):
    '''
    Given the local database, search for recordings by the given artists
    '''

    def __init__(self):
        RecordingSearchByArtistService.__init__(self)

    def search(self, artist_mbids, begin_percent, end_percent, num_recordings):
        """
        Perform an artist search. Parameters:

        artist_mbids - a list of artist_mbids for which to search recordings
        begin_percent - if many recordings match the above parameters, return only
                        recordings that have a minimum popularity percent score
                        of begin_percent.
        end_percent - if many recordings match the above parameters, return only
                      recordings that have a maximum popularity percent score
                      of end_percent.
        num_recordings - ideally return these many recordings

        If only few recordings match, the begin_percent and end_percent are
        ignored.
        """

        query = """SELECT popularity
                        , recording_mbid
                        , artist_mbid
                        , subsonic_id
                     FROM recording
                     JOIN recording_metadata
                       ON recording.id = recording_metadata.recording_id
                LEFT JOIN recording_subsonic
                       ON recording.id = recording_subsonic.recording_id
                    WHERE artist_mbid in (%s)
                 ORDER BY artist_mbid
                        , popularity"""

        # One "?" placeholder per MBID keeps the query parameterised.
        placeholders = ",".join(("?", ) * len(artist_mbids))
        cursor = db.execute_sql(query % placeholders, params=tuple(artist_mbids))

        # Group the result rows by artist_mbid.
        artists = defaultdict(list)
        for rec in cursor.fetchall():
            artists[rec[2]].append({"popularity": rec[0], "recording_mbid": rec[1], "artist_mbid": rec[2], "subsonic_id": rec[3]})

        for artist in artists:
            artists[artist] = select_recordings_on_popularity(artists[artist], begin_percent, end_percent, num_recordings)

        return artists
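
Note: the placeholder technique used in `search` (one `?` per value inside `IN (...)`) can be tried in isolation with the standard `sqlite3` module. The sketch below is a simplified stand-in: it assumes a single `recording` table holding all three columns instead of the joins above, and `recordings_by_artist` is a hypothetical function name.

```python
import sqlite3
from collections import defaultdict


def recordings_by_artist(conn, artist_mbids):
    """Group recordings by artist using a parameterised IN clause."""
    if not artist_mbids:
        return {}
    # Only "?" characters are interpolated into the SQL, never user data.
    placeholders = ",".join("?" * len(artist_mbids))
    query = f"""SELECT popularity, recording_mbid, artist_mbid
                  FROM recording
                 WHERE artist_mbid IN ({placeholders})
              ORDER BY artist_mbid, popularity"""
    grouped = defaultdict(list)
    for popularity, recording_mbid, artist_mbid in conn.execute(query, tuple(artist_mbids)):
        grouped[artist_mbid].append({"popularity": popularity, "recording_mbid": recording_mbid})
    return dict(grouped)
```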