9 comments

  • __float 1 hour ago
    I don't know what resolution or display you built this on, but as a heads-up, my first impression on a 4K monitor is that everything is incredibly tiny.
    • alder 1 hour ago
      To be honest I haven't tested it on a 4K monitor yet, so I am not surprised. There are two controls above the transcript that change the font size and the line spacing, which should help a bit for now. Something to fix, thanks!
  • jrrv 1 hour ago
    Is it possible to add traditional characters for Mandarin?

    Also, the pinyin for 誰/谁 is coming through as shuí. This character has two pronunciations, but I believe shéi is the more common one.

    • alder 40 minutes ago
      Thanks! Chinese and Japanese as source languages are still experimental. I did my best to support them, but I have to rely on people who actually know the languages, so this kind of feedback is really useful. I'll look into adding traditional characters and fixing the pinyin.
      • jrrv 35 minutes ago
        No worries, I appreciate the effort. I did go back and listen, and they are indeed pronouncing it shéi in the audio too.

        I use a Firefox extension to convert simplified to traditional. It looks like it's open source, so it may be of some use to you: https://github.com/tongwentang/tongwentang-extension.

        There are some clashes that it does not handle, though. For example, 隻 and 只 are both 只 in simplified, so you have to know from context which one is meant, and the extension fails to convert to 隻 where appropriate.
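
To illustrate the 只/隻 clash, here is a minimal sketch of one common workaround: check a phrase table before falling back to a per-character table, so that context such as a preceding numeral decides the conversion. The tables and the `to_traditional` helper are hypothetical and tiny, and this is not how the extension above works internally.

```python
# Hypothetical, tiny mapping tables for illustration only; a real converter
# needs far larger dictionaries.
CHAR_MAP = {
    "猫": "貓",
    "这": "這",
    "谁": "誰",
    # 只 is deliberately left unmapped: as "only" it is already traditional.
}

# Phrases where 只 is a measure word and should become 隻.
PHRASE_MAP = {
    "一只": "一隻",
    "两只": "兩隻",
}


def to_traditional(text: str) -> str:
    """Convert simplified to traditional, preferring the longest phrase match."""
    max_len = max(len(p) for p in PHRASE_MAP)
    out = []
    i = 0
    while i < len(text):
        # Try multi-character phrases first so context wins over single characters.
        for size in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + size]
            if chunk in PHRASE_MAP:
                out.append(PHRASE_MAP[chunk])
                i += size
                break
        else:
            # No phrase matched here: fall back to the per-character table.
            out.append(CHAR_MAP.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(to_traditional("我只有一只猫"))
```

The first 只 ("only") is left alone and the second one (measure word) becomes 隻. A phrase table only covers the contexts it lists, so this reduces the clashes without eliminating them.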

  • pzagor2 20 minutes ago
    I also built a tool to help me study Spanish. I really like the idea of shadowing, so I built a tool that lets you take any YouTube video and generate a sentence-by-sentence exercise to help you repeat the speaker's phrases.

    https://talkhabit.com/shadow Or, for an example of one exercise: https://talkhabit.com/shadow?videoUrl=https%3A%2F%2Fwww.yout...

    Stuff I need to work on:
    - It only works with videos that have auto-generated captions
    - It works best with monologue videos

  • deaton 8 minutes ago
    This is really cool. I'm just getting towards the back end of the Kaishi 1.5k deck, so this will be perfect for my Japanese studies. Thanks for sharing.
  • jcg591 44 minutes ago
    Very cool! I'm also learning Greek and it's amazing how many resources are becoming available.
    • alder 29 minutes ago
      Thanks! Yes, it's getting better for Greek but still not on par with other languages. I completed the only 2 Greek levels on Duolingo and they are really boring compared to the German one I am doing now. Easy Greek is a bit above my level, and the number of YouTubers in Greek is tiny compared to German.
  • Koaisu 1 hour ago
    Just tried it with an unsupported language and it still worked. I set it to Chinese and inputted the audio, and still got correct results.
  • 3stacks 1 hour ago
    This is awesome! I’ll be lurking for new data sources. I’m working on a self-hosted language app more focused around cloze and sentence mining into Anki. I love seeing more stuff happening in this space.
    • alder 1 hour ago
      Thanks! I am glad you like it! I essentially mine the source audio, and all examples have cloze-style gaps (blurring, in my case) that are revealed on the back of the card. I also beep the word in the sentence when you try to play it on the front of the card in the built-in SRS system. Unfortunately that is not implemented in the Anki export, but it is technically possible.
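
As a rough illustration of the cloze idea described above (not the app's actual code), the sketch below masks a target word on the front of a card and reveals the full sentence on the back. The `make_cloze_card` helper and its field names are hypothetical; the `{{c1::...}}` form is the syntax Anki's cloze note type uses.

```python
def make_cloze_card(sentence: str, target: str, mask_char: str = "_") -> dict:
    """Mask the first occurrence of target on the front; show it on the back."""
    if target not in sentence:
        raise ValueError(f"{target!r} does not occur in the sentence")
    masked = sentence.replace(target, mask_char * len(target), 1)
    # Anki cloze deletions wrap the hidden text as {{c1::text}}.
    anki = sentence.replace(target, "{{c1::" + target + "}}", 1)
    return {"front": masked, "back": sentence, "anki": anki}


card = make_cloze_card("Ich lerne jeden Tag Deutsch.", "lerne")
print(card["front"])
print(card["anki"])
```

Plain substring matching is a simplification; with a segmenter in the pipeline, masking by token index would avoid hitting the same letters inside a longer word.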
  • dirteater_ 52 minutes ago
    What are you doing for Chinese word segmentation/pinyin?
    • alder 10 minutes ago
      For segmentation and POS I rely on spaCy's zh_core_web_sm, and the pinyin comes from the pypinyin library. There is also a small correction layer on top. But I am not enough of a Chinese language expert to judge whether it really works, so I'll rely on feedback from users to improve it.
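
A correction step on top of library output could look something like the sketch below, which is an assumption about the general shape and not the actual implementation. To keep it self-contained, `base_pinyin` is a hard-coded stand-in for the real pinyin call, and the `OVERRIDES` table is hypothetical; in practice the tokens would come from the segmenter and the base readings from the pinyin library.

```python
from typing import Callable, Dict, List

# Stand-in for a real pinyin library call; these readings are hard-coded
# for illustration and are NOT produced by pypinyin.
BASE_READINGS: Dict[str, str] = {"谁": "shuí", "是": "shì", "你": "nǐ"}


def base_pinyin(token: str) -> str:
    """Look up each character, leaving unknown characters unchanged."""
    return " ".join(BASE_READINGS.get(ch, ch) for ch in token)


# Hypothetical override table, consulted before the base reading is used.
OVERRIDES: Dict[str, str] = {"谁": "shéi"}


def corrected_pinyin(
    tokens: List[str], base: Callable[[str], str] = base_pinyin
) -> List[str]:
    """Return one reading per token, with manual overrides taking priority."""
    return [OVERRIDES.get(tok, base(tok)) for tok in tokens]


print(corrected_pinyin(["你", "是", "谁"]))
```

Keying overrides on whole tokens means a fix for 谁 on its own does not silently change readings inside longer words.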
  • hiAndrewQuinn 1 hour ago
    Very nice work. I'm going for a different thing, but my audio2anki tool [1] is about as streamlined as I could make it for turning a YouTube URL I want to learn from into a stack of Anki flashcards, purely locally.

    [1]: https://github.com/hiAndrewQuinn/audio2anki