Friday, August 10, 2012

Music support for the MetaData Extractor

Just a quick update on Nepomuk's metadata extractor.

I have added support for music files via MusicBrainz. From now on you can easily fetch additional data from there. To get some additional search parameters, the ID3 tags are read in via TagLib.


Now all "important" file types (documents, video, music) are handled by this little helper.
The next step will be some UI cleanup to improve usability.
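
To make the idea of "search parameters from tags" a bit more concrete, here is a minimal sketch of how tag values could be turned into a MusicBrainz search request. This is not the code of the bundled musicbrainz.py plugin: the function name `build_recording_query` and the tag dictionary keys are assumptions made up for illustration. The only external piece it relies on is the public MusicBrainz web service search endpoint (`/ws/2/recording` with a Lucene-style `query` parameter).

```python
from urllib.parse import urlencode

# Public MusicBrainz web service endpoint for recording searches.
MUSICBRAINZ_SEARCH = "https://musicbrainz.org/ws/2/recording"

# Hypothetical mapping from tag names (as a tag reader might hand them over)
# to MusicBrainz search fields.
TAG_TO_FIELD = {"title": "recording", "artist": "artist", "album": "release"}


def build_recording_query(tags, limit=5):
    """Build a search URL from whatever tag values are present."""
    parts = []
    for tag, field in TAG_TO_FIELD.items():
        value = (tags.get(tag) or "").strip()
        if value:
            # Search the value as a quoted phrase; escape backslashes
            # and embedded double quotes first.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            parts.append('{}:"{}"'.format(field, escaped))
    if not parts:
        raise ValueError("no usable tags to search with")
    params = {"query": " AND ".join(parts), "fmt": "json", "limit": limit}
    return MUSICBRAINZ_SEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_recording_query({"title": "Buzzards + Worms", "artist": "Banyan"}))
```

The more tags are available, the narrower the query becomes; with no tags at all the sketch refuses to search rather than send an empty query.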

10 comments:

  1. Joerg, am I missing something?
    metadataextractor -t -a \[01\]\ Buzzards\ +\ Worms.mp3
    metadataextractor(12695) NepomukMetaDataExtractor::Extractor::ExtractorFactory::loadScriptInfo: available interpreters ("javascript", "qtscript")
    Kross: "Loading the interpreter library for javascript"
    Kross: "Successfully loaded Interpreter instance from library."
    Kross: "Loading the interpreter library for qtscript"
    Kross: "Successfully loaded Interpreter instance from library."
    metadataextractor(12695) NepomukMetaDataExtractor::Extractor::ExtractorFactory::loadScriptInfo: look for plugins in the folder: "/usr/share/kde4/apps/nepomukmetadataextractor/plugins/"
    metadataextractor(12695) NepomukMetaDataExtractor::Extractor::ExtractorFactory::loadScriptInfo: look for plugins in the folder: "/home/joerg/Development/KDE/metadataextractor/lib/webextractor/plugins/"
    metadataextractor(12695) NepomukMetaDataExtractor::Extractor::ExtractorFactory::loadScriptInfo: found 0 search plugins
    metadataextractor(12695) NepomukMetaDataExtractor::Extractor::ResourceExtractor::addFilesToList: skip file KUrl("file:///home/hrvoje/Music/80's/Banyan-Anytime At All/[01] Buzzards + Worms.mp3") because it already has some meta data that would be overwritten use force update to fetch meta data anyway
    metadataextractor(12695) NepomukMetaDataExtractor::UI::AutomaticFetcher::startFetcher: Start fetching meta data for 0 items
    metadataextractor(12695) NepomukMetaDataExtractor::UI::AutomaticFetcher::searchNextItem: Finished fetching all items



    python /usr/share/kde4/apps/nepomukmetadataextractor/plugins/musicbrainz.py
    Plugin information:
    resource: ['music']
    description: MusicBrainz is an open music encyclopedia that collects music metadata and makes it available to the public.
    icon: musicbrainz.png
    author: Joerg Ehrichs
    errorMsg:
    urlregex: ['http://musicbrainz.org/recording.*']
    isAvailable: True
    identifier: musicbrainz
    email: joerg.ehrichs@gmx.de
    name: MusicBrainz

    Replies
    1. it says
      "...addFilesToList: skip file KUrl("...Banyan-Anytime At All/[01] Buzzards + Worms.mp3") because it already has some meta data that would be overwritten use force update to fetch meta data anyway "

      Please add the force option (-f).
      Otherwise the program skips all files that already have "valid" metadata. Usually this is created by the file indexer when it reads the ID3 tags.

      so start it with
      metadataextractor -a -f \[01\]\ Buzzards\ +\ Worms.mp3

      or
      metadataextractor -f \[01\]\ Buzzards\ +\ Worms.mp3
      to get the GUI (I haven't actually checked the automatic version).
      MusicBrainz does not always return the best match as the first search result, so the purely automatic version might fetch wrong metadata.

      I'll work out something that does a small check on all returned results and estimates how likely it is that one of them is the one you wanted.

    2. OK, but I'm more worried that it doesn't find any plugins:
      metadataextractor(12695) NepomukMetaDataExtractor::Extractor::ExtractorFactory::loadScriptInfo: found 0 search plugins

      Also, no search engines are listed in the GUI.

      This one was only an example; it's the same thing with TV shows/videos.

    3. Oh, this shouldn't happen.

      Did you install the extractor, or just run make and execute it from there?

      The Python plugins are supposed to be in
      /usr/share/kde4/apps/nepomukmetadataextractor/plugins/

  2. Yup:

    ls -l /usr/share/kde4/apps/nepomukmetadataextractor/plugins/
    total 92
    -rw-r--r-- 1 root root 988 Aug 10 18:26 imdb.png
    -rw-r--r-- 1 root root 8908 Aug 10 18:26 imdb.py
    -rw-r--r-- 1 root root 771 Aug 10 18:26 microsoft-academic-search.png
    -rw-r--r-- 1 root root 14758 Aug 10 18:26 microsoft-academics.py
    -rw-r--r-- 1 root root 554 Aug 10 18:26 musicbrainz.png
    -rw-r--r-- 1 root root 9282 Aug 10 18:26 musicbrainz.py
    -rw-r--r-- 1 root root 27684 Aug 10 18:26 tvdb.png
    -rw-r--r-- 1 root root 9268 Aug 10 18:26 tvdb.py

    Replies
    1. Oh, and if you're wondering, I'm running KDE trunk on openSUSE 12.2

    2. Ahh, SUSE, I remember something (and just saw it in your output above):

      metadataextractor(12695) NepomukMetaDataExtractor::Extractor::ExtractorFactory::loadScriptInfo: available interpreters ("javascript", "qtscript")

      You don't have the Python interpreter for the Kross framework.

      There is a small hint about this in the README:

      openSUSE users need to install the Python Kross interpreters from source, see:
      https://projects.kde.org/projects/kde/kdebindings/kross-interpreters/repository

      That's a hint I found while searching for Kross a while ago. I never tested it, though.

    3. Cool, got it ;)
      Working now, thanks! :)
      Great work!

  3. You need to have a separate Python Kross package installed on Ubuntu as well. I wonder if a CMake warning could be added in case scripting support is missing?

  4. I'm getting crashes with music files (when trying from within Dolphin):
    #6 0x00007f17042bccfb in NepomukMetaDataExtractor::Extractor::AudioExtractor::findByTag (this=this@entry=0x7fff97d2f5a0, mdp=mdp@entry=0x2929630) at /usr/src/debug/nepomuk-metadata-extractor-1345110377/lib/resourceextractor/audioextractor.cpp:48
    #7 0x00007f17042bd23a in NepomukMetaDataExtractor::Extractor::AudioExtractor::parseUrl (this=0x7fff97d2f5a0, mdp=0x2929630, fileUrl=...) at /usr/src/debug/nepomuk-metadata-extractor-1345110377/lib/resourceextractor/audioextractor.cpp:41
    #8 0x00007f17042b54da in NepomukMetaDataExtractor::Extractor::ResourceExtractor::fileChecker (this=this@entry=0x1e6e3d0, mdp=mdp@entry=0x2929630, fileUrl=...) at /usr/src/debug/nepomuk-metadata-extractor-1345110377/lib/resourceextractor/resourceextractor.cpp:240
    #9 0x00007f17042b69b3 in NepomukMetaDataExtractor::Extractor::ResourceExtractor::addFilesToList (this=0x1e6e3d0, fileUrl=...) at /usr/src/debug/nepomuk-metadata-extractor-1345110377/lib/resourceextractor/resourceextractor.cpp:208
    #10 0x00007f17042b718d in NepomukMetaDataExtractor::Extractor::ResourceExtractor::lookupFiles (this=0x1e6e3d0, fileOrFolder=...) at /usr/src/debug/nepomuk-metadata-extractor-1345110377/lib/resourceextractor/resourceextractor.cpp:112
    #11 0x00000000004029a3 in main (argc=2, argv=0x7fff97d2fcb8) at /usr/src/debug/nepomuk-metadata-extractor-1345110377/extractor/main.cpp:111


    If you're aware of this, that's OK, just checking :)
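
Coming back to the plan from the first thread to check all returned results instead of trusting the first hit: a minimal sketch of such a check could look like the following. It is only an illustration, not what the extractor actually does. `rank_candidates`, `best_match`, the shape of the candidate dictionaries and the 0.6 threshold are all assumptions; string similarity comes from Python's standard `difflib` module.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def rank_candidates(tags, candidates, fields=("title", "artist", "album")):
    """Score each candidate by its average similarity to the local tags.

    Only fields present in both the tags and the candidate are compared.
    Returns a list of (score, candidate) pairs, best match first.
    """
    scored = []
    for cand in candidates:
        scores = [
            similarity(tags[f], cand[f])
            for f in fields
            if tags.get(f) and cand.get(f)
        ]
        score = sum(scores) / len(scores) if scores else 0.0
        scored.append((score, cand))
    # Sort on the score only, so the dictionaries are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def best_match(tags, candidates, threshold=0.6):
    """Return the top candidate, or None if nothing is similar enough."""
    ranked = rank_candidates(tags, candidates)
    if ranked and ranked[0][0] >= threshold:
        return ranked[0][1]
    return None


if __name__ == "__main__":
    # Made-up search results, purely for demonstration.
    tags = {"title": "Buzzards + Worms", "artist": "Banyan"}
    results = [
        {"title": "Worms", "artist": "Other Band"},
        {"title": "Buzzards + Worms", "artist": "Banyan"},
    ]
    print(best_match(tags, results))
```

Returning `None` below the threshold lets an automatic run skip a file rather than attach metadata it is not reasonably sure about; the GUI could then be used for those leftovers.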
