I’m a systems librarian in an academic library. I moved over to Lemmy after Rexxit 2023. I’ve had an account on sdf.org since 2009 (under a different username), so I chose this instance out of a sense of nostalgia. I do all sorts of fiber arts (knitting, cross stitch, sewing) and love dogs.

  • 1 Post
  • 142 Comments
Joined 2 years ago
Cake day: July 3rd, 2023

  • grysbok@lemmy.sdf.org to Programmer Humor@programming.dev · lads
    English · ↑ 2 · edited 33 minutes ago

    I mean, I enjoy Linux sysadmin work, but fighting bots takes time, experimentation, and research, and there’s other stuff I should be doing, such as accessibility updates to our websites. But accessibility doesn’t matter a lick if you can’t reach the website at all because of timeouts.


  • grysbok@lemmy.sdf.org to Programmer Humor@programming.dev · lads
    English · ↑ 1 · 36 minutes ago

    Yep, they’ll just burn taxpayer resources (me and my poor servers) because it’s not like they pay taxes anyway (assuming they are either a corporation or not based in the same locality as I am).

    There’s only one of me, and if I’m working on keeping the servers bare-minimum functional today, I’m not working on making something more awesome for tomorrow. “Linux sysadmin” is only supposed to be up to 30% of my job.



  • grysbok@lemmy.sdf.org to Programmer Humor@programming.dev · lads
    English · ↑ 5 · 4 hours ago

    I just looked at my log for this morning. 23% of my total requests were from the user agent GoogleOther. Other visitors include GPTBot, SemanticScholarBot, and Turnitin. Those are the crawlers that are still trying after I’ve had Anubis on the site for over a month. It was much, much worse before, when they could crawl the site instead of being blocked.

    That doesn’t include the bots that lie about being bots. Looking back at an older screenshot of a monitor (I don’t have the logs themselves anymore), I seriously doubt I had 43,000 unique visitors using Windows per day in March.
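
For anyone who wants to run a similar tally on their own logs, here is a minimal sketch. It assumes the web server writes the common "combined" access log format, where the user agent is the last quoted field; the function name and the sample lines are made up for illustration and are not taken from the logs described above.

```python
import re
from collections import Counter

# In the combined log format, each line ends with: "referer" "user-agent"
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')

def user_agent_shares(lines):
    """Return {user_agent: fraction of requests} for lines that parse."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    total = sum(counts.values())
    return {ua: n / total for ua, n in counts.items()} if total else {}

# Hypothetical sample lines, for illustration only
sample = [
    '203.0.113.5 - - [01/Jul/2025:08:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "GoogleOther"',
    '203.0.113.6 - - [01/Jul/2025:08:00:01 +0000] "GET /b HTTP/1.1" 200 512 "-" "GPTBot"',
    '203.0.113.7 - - [01/Jul/2025:08:00:02 +0000] "GET /c HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.8 - - [01/Jul/2025:08:00:03 +0000] "GET /d HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

shares = user_agent_shares(sample)
for agent, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{share:6.1%}  {agent}")
```

This only counts what clients claim to be, so bots that send a browser user agent end up in the browser rows.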


  • grysbok@lemmy.sdf.org to Programmer Humor@programming.dev · lads
    English · ↑ 10 · edited 5 hours ago

    Timing and request patterns. The increase in traffic coincided with the increase in AI in the marketplace. Before, we’d get hit by bots in waves and we’d just suck it up for a day. Now it’s constant. The request patterns are deep, deep Solr requests, with far more filters than any human would ever use. These are expensive requests, and the results aren’t any more informative than just scooping up the nicely formatted EAD/XML finding aids we provide.

    And, TBH, I don’t care if it’s AI. I care that it’s rude. If the bots respected robots.txt then I’d be fine with them. They don’t and they break stuff for actual researchers.
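
As a rough illustration of the "far more filters than any human would ever use" signal, here is a small sketch that counts filter parameters in a request URL. It assumes filters arrive as repeated Solr `fq` query parameters; a front end may name them differently, and both the example URLs and the threshold of 5 are hypothetical, not values from the site described above.

```python
from urllib.parse import urlsplit, parse_qsl

def count_filters(url, filter_param="fq"):
    """Count how many times the filter parameter appears in the query string."""
    query = urlsplit(url).query
    pairs = parse_qsl(query, keep_blank_values=True)
    return sum(1 for key, _ in pairs if key == filter_param)

def looks_automated(url, max_filters=5):
    """Flag requests stacking more filters than the (assumed) human ceiling."""
    return count_filters(url) > max_filters

# Hypothetical URLs, for illustration only
human_url = "https://archive.example.edu/search?q=letters&fq=year:1920&fq=type:photo"
bot_url = "https://archive.example.edu/search?q=*&" + "&".join(
    f"fq=facet{i}:value{i}" for i in range(8)
)

print(count_filters(human_url), looks_automated(human_url))
print(count_filters(bot_url), looks_automated(bot_url))
```

A count like this is only a heuristic; it would need tuning against real traffic before being used to block anything.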


  • grysbok@lemmy.sdf.org to Programmer Humor@programming.dev · lads
    English · ↑ 11 · 6 hours ago

    You’re right. AI didn’t just triple the traffic to my tiny archive’s site. It way more than tripled it. After implementing Anubis, we went from 3000 ‘unique’ visitors down to 20 in a half-day. Twenty is a much more expected number for a small college archive in the summer. That’s before I did any fine-tuning to Anubis, just the default settings.

    I was getting constant outage reports. Now I’m not.

    For us, it’s not about protecting our IP. We want folks to be able to find information. That’s why we write finding aids, scan materials, and accession them. But allowing bots to siphon it all up inefficiently was denying everyone access to it.

    And if you think bots aren’t inefficient, explain why Facebook requests my robots.txt 10 times a second.


  • grysbok@lemmy.sdf.org to 196@lemmy.blahaj.zone · Rule
    English · ↑ 34 · 16 days ago

    I think I’d be pretty distracted if they also declared the anti-trans rhetoric was all a mistake and actually we love and support LGBTQ+ of all stripes and here’s a big pile of money as an apology for earlier. Also we removed the sex and/or gender markings from all American passports.

    That would help distract me.