Source: https://front-end.social/@fox/110846484782705013

Text in the screenshot from Grammarly says:

We develop data sets to train our algorithms so that we can improve the services we provide to customers like you. We have devoted significant time and resources to developing methods to ensure that these data sets are anonymized and de-identified.

To develop these data sets, we sample snippets of text at random, disassociate them from a user’s account, and then use a variety of different methods to strip the text of identifying information (such as identifiers, contact details, addresses, etc.). Only then do we use the snippets to train our algorithms, and the original text is deleted. In other words, we don’t store any text in a manner that can be associated with your account or used to identify you or anyone else.

We currently offer a feature that permits customers to opt out of this use for Grammarly Business teams of 500 users or more. Please let me know if you might be interested in a license of this size, and I’ll forward your request to the corresponding team.

      • CaptObvious@literature.cafe
        1 year ago

        They’d’ve gotten it wrong too. Prepositions and postpositions are their own category of linguistic hell, especially in idioms and phrasal verbs.

        • SheeEttin@lemmy.world
          1 year ago

          They’dn’t’ve necessarily gotten it wrong. With a big enough dataset, an ML tool should be pretty accurate, at least in that it will make the same choices as most people have made in their writing.

          • CaptObvious@literature.cafe
            1 year ago

            They’d’n’tve

            Apostrophe mistakes aside, no native speaker would stack contractions like this. There’s an upper limit of three words in a single contracted form. It would be “They wouldn’t’ve gotten” or “They’d not’ve gotten.”

            ML tools don’t write grammatically correct complex sentences precisely because their training sets contain too many discrepancies. They may learn how to apply prescriptive rules consistently one day, perhaps even one day soon, but this is not that day.