I am making an unofficial Reddit API that mimics the official one.

It's early days, but I would like to discuss it here, since my post was blocked on Reddit (of course).

Let me know what you think of the project; any input is welcome.

  • Emily (she/her)@lemmy.blahaj.zone · English · 11 upvotes · edited · 4 months ago

    Is there a reason you’re scraping data rather than attaching a network sniffer to the official apps (or reverse engineering them) and documenting the results? Or mapping the RSS feed to an API? The main thrust behind my comment is that I think scraping is pretty fragile, so I’m interested in why the other options are infeasible.

    • MHLoppy@fedia.io · 14 upvotes · 4 months ago

      There’s currently no implementation (the repos are currently just skeletons), so it could just be a semantics difference right now.

      • nyan@lemmy.cafe · English · 17 upvotes · 4 months ago

        This is likely to be C&D’d as well if it ever reaches the point where it does anything useful (remember, reddit doesn’t need grounds that would hold up in court to send a C&D).

        • Anon Coder@discuss.online (OP) · English · 2 upvotes · 4 months ago

          Don’t worry, it won’t be a problem. I have taken reasonable measures to ensure my anonymity, and you can’t really kill free/libre software easily anyway.

              • Enoril@jlai.lu · English · 2 upvotes · 4 months ago

                I know, he is also hosted by a German association under the same ID. Both GitHub and the association will have to follow the law anyway.

      • Emily (she/her)@lemmy.blahaj.zone · English · 6 upvotes · edited · 4 months ago

        I suspect that any of the methods proposed here would be prone to a C&D, but IMO the safest legally would probably be the RSS method (not a lawyer though). Reddit’s RSS feeds are public, documented, and available without the need for private APIs, authentication, or an API key, so I don’t see how they could claim that a wrapper is unauthorised/illegal. Documenting their private API, however, seems like a gray area. In Google LLC v. Oracle America, Inc., the US Supreme Court assumed without deciding that the API in question was copyrightable and held that Google’s copying of it was fair use, so this use may constitute fair use too.
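
To make the RSS route concrete, here is a minimal Python sketch of mapping a feed onto API-style post records. It assumes the feed is Atom-formatted; the sample XML, the field names in the returned dicts, and the `feed_to_posts` helper are all hypothetical illustrations, not Reddit's actual output or anything from the project.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample shaped like an Atom feed; not real Reddit data.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>example feed</title>
  <entry>
    <id>t3_abc123</id>
    <title>First post</title>
    <author><name>/u/alice</name></author>
    <link href="https://example.invalid/comments/abc123/"/>
    <updated>2024-01-01T00:00:00+00:00</updated>
  </entry>
  <entry>
    <id>t3_def456</id>
    <title>Second post</title>
    <author><name>/u/bob</name></author>
    <link href="https://example.invalid/comments/def456/"/>
    <updated>2024-01-02T00:00:00+00:00</updated>
  </entry>
</feed>"""


def feed_to_posts(xml_text):
    """Convert Atom feed text into a list of plain dicts, one per entry."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        posts.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "author": entry.findtext(
                "atom:author/atom:name", default="", namespaces=ATOM_NS
            ),
            "url": link.get("href", "") if link is not None else "",
            "updated": entry.findtext(
                "atom:updated", default="", namespaces=ATOM_NS
            ),
        })
    return posts


if __name__ == "__main__":
    for post in feed_to_posts(SAMPLE_FEED):
        print(post["id"], post["title"], post["url"])
```

A wrapper along these lines only exposes what the feed itself carries, so it would cover reading posts but not the broader functionality of the official API.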

    • Anon Coder@discuss.online (OP) · English · 1 upvote · 4 months ago

      Because we need to retain the breadth of functionality the API has. If you just want to scrape posts, APIs for that already exist, but I am aiming for something more.

      As for reverse engineering, they can change that part at any time too, and it may be even more fragile, since they can change it without breaking the UX. If they change the front page CSS selectors or layout, for example, it will affect the UX more, because that changes the expected output rather than the middle layer, which is just raw data.

      That’s my reasoning. I appreciate the input though (:

      • Emily (she/her)@lemmy.blahaj.zone · English · 4 upvotes · edited · 4 months ago

        Making a breaking change to the mobile API also breaks old, outdated installations of the app. Websites and their APIs are usually kept in sync; apps are not.

        If they were really motivated to stop your method, they could just obfuscate the frontend with webpack and break your scraper every time they make an update.
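
The fragility concern can be illustrated with a small Python sketch. The markup, the class names `post-title` and `x9f2k`, and the `scrape_titles` helper are all made up for the example; it only shows how a scraper keyed to a class name silently returns nothing once a build step renames that class, even though the rendered page looks identical to users.

```python
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Collects the text of <a> tags carrying a given CSS class."""

    def __init__(self, title_class):
        super().__init__()
        self.title_class = title_class
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and self.title_class in classes:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles.append(data.strip())


def scrape_titles(html, title_class="post-title"):
    """Return the titles found under the (hypothetical) title class."""
    scraper = TitleScraper(title_class)
    scraper.feed(html)
    scraper.close()
    return scraper.titles


# Hypothetical markup before and after a frontend rebuild renames the class.
BEFORE = '<div><a class="post-title" href="/p/1">Hello world</a></div>'
AFTER = '<div><a class="x9f2k" href="/p/1">Hello world</a></div>'

if __name__ == "__main__":
    print(scrape_titles(BEFORE))
    print(scrape_titles(AFTER))
```

The scraper finds the title in `BEFORE` and nothing in `AFTER`, with no error raised, which is why this kind of breakage tends to go unnoticed until someone checks the output.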