Tumblr Dead Links

Damaged
Non-Fungible Trixie -
Fine Arts - Two hundred uploads with a score of over a hundred (Safe/Suggestive)
Perfect Pony Plot Provider - Uploader of 10+ images with 350 upvotes or more (Questionable/Explicit)
Notoriously Divine Tagger - Consistently uploads images above and beyond the minimum tag requirements. And/or additionally, bringing over the original description from the source if the image has one. Does NOT apply to the uploader adding several to a dozen tags after originally uploading with minimum to bare tagging.
Magnificent Metadata Maniac -
Wallet After Summer Sale -
Equality - In our state, we do not stand out.
Magical Inkwell - Wrote MLP fanfiction consisting of at least around 1.5k words, and has a verified link to the platform of their choice
Not a Llama - Happy April Fools Day!
Happy Derpy! - For Patreon supporters

Word Bug
So with Tumblr banning and people quitting, a lot of very high traffic tumblr usernames were freed… and then snapped up by spam/scam/porn bots.
 
Example: >>1831863 (merged) (image is explicit)
 
This points to magnalunansfw.tumblr.com which seems to be owned by a porn site now, and redirects away from Tumblr (yes, I know, the irony of all this is insane).
 
Now, what should we do (apart from marking the source dead)? Can we report it to have the source removed, or should we just change it ourselves to point to something safe (like google.com or perhaps the artist’s new site)?
 
EDIT  
In this case I pointed it to their new Twitter NSFW feed and tagged it appropriately. Maybe we should have a forum thread where such “dead sources” can be listed for staff to mass tag?
Background Pony #FD89
Some of the sites being redirected to are insecure or actively attempt to compromise a visitor’s security, so I must advocate a mass removal by staff of all existing tumblr image sources.
 
Future re-sourcing and sourcing of new images to currently-legitimate tumblr links might be okay, but should be discussed, as tumblr’s current direction and technical difficulties create instability and insecurity for artists and sow a fertile field for future attacks of the same kind.
 
Because derpibooru user security is threatened by these links on derpibooru pages, this is not an action which can tolerate any further delay.
Derpy Whooves
Preenhub - We all know what you were up to this evening~
Artist -
My Little Pony - 1992 Edition
Artistic Detective - For awesome dedication to sleuthing out and maintaining artist tags and links
Economist -
Not a Llama - Happy April Fools Day!

Looking For My Doctor
@Damaged  
If you would please report an example image that has the bad source, for a reason of “Other”, and let us know that the source is bad, then one of staff will look at it and see what can be done. One per artist is all that we need. Usually, I’ve been just deleting those sources when they show up, but it does have to be done by hand which can take a while.
 
I do have a personal project of updating all the verified artist links from the Tumblr diaspora, but other projects have been taking precedence over it, so maybe I’ll just devote a week to getting those cleaned up. That won’t fix all the Tumblr redirects, but I will clean up those bad sources as I find them.
 
@Background Pony #7285  
The majority of Tumblr sources are still fine. Removing all Tumblr sources would be unfair to artists whose Tumblr blogs were unaffected by the diaspora, or who are still accessible through the Tumblr “console” thing they’re doing now.
 
Maybe something can be done programmatically at some point, but for now please just report an image when you find that the source has been hijacked, as I mentioned above.
Derpy Whooves

Looking For My Doctor
@Damaged  
Actually, that “everything explicit” search isn’t accurate: many of those lead to the new Tumblr console thing. I thought they’d be gone by now, but a lot of them are still available, just … inconvenient.
 
I’ve grabbed the ones you already reported … which means you’ve maxed your reports already. Huh. That’s probably not a good thing, and might flood the reports, too.
 
Maybe you’re right about making a thread to report these. Let me think about it for a bit and I’ll make one once I figure out what data we need and how to manage it.
Derpy Whooves

Looking For My Doctor
@Damaged  
I’m sorry, I might have miscommunicated, and might have misunderstood what you were referring to in your first post. After looking through your reports, some of those don’t point to hijacked sites or malware.
 
If you find a source url that points to a hijack or malware site, please do report those. But if the source is just dead, then adding “Dead Source” as a tag is fine.
 
We have been seeing hijacked Tumblr URLs, but really not that many. And the hijacks that I’ve worked on don’t seem to have anything to do with whether the URL is dead or an active ongoing blog.
 
So, I agree these dead links are a concern, but I don’t think every dead source is a potential hijack, any more than any of the “live sources” are.
 
And it would be good not to delete all those old URLs without having a valid new source, because they’re handy as a part of our internal processes, even if they’re dead. For example, I have used them to sort out disputes about who the original artist was, or when something goes wrong with aliasing and I need to sort out one artist tag into two tags.
 
So, for myself, I think it’s better to just report the source URLs that point to hijacks or malware, and I’ll make updating all the verified links from the diaspora a front-burner project instead of a back-burner one.
 
I’ll take care of those reports so that you aren’t jammed up against the max report thing, and thank you for letting us know about the mess with the MagnaLuna URLs. I’ll start zapping those.
Derpy Whooves

Looking For My Doctor
Just to make sure I’m communicating well, the source for images like this “looks dead”:
 

 
But you can still get to the original image and blog via the “console” on Tumblr.
 
On the other hand, the source on this image:
 

 
Is radioactive malware that no one should click on unless they’re on a very secure system.
 
The source on the first image is fine, and the source is not dead. It’s just inconvenient.
 
The source on the second image should be reported, and I will go and delete all of those now. Because the source will be deleted, there’s no need to tag the image “Dead Source”.
Derpy Whooves

Looking For My Doctor
@Damaged  
Having a list of source URLs that have been hijacked would be nice. If you figure out how to do that, please PM me the list; one unique link for each subdomain/artist tag would be ideal. The tricky bit is that some of the redirects are intentional, either to Tumblr’s new “Console” view of those same blogs or to artists’ own domains, and those are links we still want to respect and preserve. And my client-fu gets fuddled when I am trying to talk with Tumblr. That site’s user interface doesn’t lend itself well to use as an API.
 
I still like to contact the artist to see how they prefer these hijacks be handled, though. Sometimes artists want to re-home the URLs to a new gallery, sometimes they just want everything pointed to a new Patreon or DeviantArt gallery index page or something like that. So, we would still do the cleaning up largely by hand, unless there was suddenly a very large volume of them.
Damaged

Word Bug
@Derpy Whooves  
I guess, ultimately, just trying to fetch the Tumblr post through the API would be enough to catch if the source needs to be looked at further.
 
https://api.tumblr.com/v2/blog/magnalunansfw.tumblr.com/posts?id=178047210371 (adding an api_key of course)
 
For example with the magnalunansfw one. And now I find out Tumblr’s API doesn’t support HEAD, so I have to make sure to get data each time. Bleh.
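
A rough sketch of that check in Python, for anyone following along. The endpoint shape and the `api_key` parameter are taken from the URL above; the function names are made up for illustration, and the `meta`/`response` envelope is how I understand Tumblr's v2 API to reply, so treat the details as assumptions rather than a tested client.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://api.tumblr.com/v2/blog"


def post_lookup_url(blog_host, post_id, api_key):
    """Build the /posts lookup URL for one post on one blog."""
    query = urlencode({"id": post_id, "api_key": api_key})
    return f"{API_ROOT}/{blog_host}/posts?{query}"


def post_exists(payload):
    """True if a decoded API reply describes at least one post."""
    if payload.get("meta", {}).get("status") != 200:
        return False
    response = payload.get("response")
    # Error replies may carry a non-dict "response", so check the type.
    return isinstance(response, dict) and bool(response.get("posts"))


def fetch_post(blog_host, post_id, api_key):
    """GET the post data (a full GET, since HEAD is not supported)."""
    try:
        with urlopen(post_lookup_url(blog_host, post_id, api_key), timeout=30) as reply:
            return json.load(reply)
    except HTTPError as err:
        # Normalise HTTP errors into the same envelope shape.
        return {"meta": {"status": err.code, "msg": str(err.reason)}, "response": []}
```

`post_exists(fetch_post(...))` coming back false only says the source deserves a closer look; it does not tell a hijack apart from a plain deletion.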
 
Anyway. That’d need to basically step through any image that has a source that is directly recognizable as a tumblr post.
 
Linking directly to images that are gone will not be a problem, but it is when people link to the blog itself that it’ll be a little bit harder to detect, since the blog exists but just redirects elsewhere immediately.
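
To tell those two cases apart, something like the following could split a source URL into a blog host and an optional post ID. This is only a sketch: the `/post/<id>` and `/image/<id>` path shapes and the list of non-blog hosts are assumptions on my part, `parse_tumblr_source` is a made-up name, and blogs on custom domains would be missed entirely.

```python
import re
from urllib.parse import urlparse

# Assumed path shapes for links to a single post.
POST_PATH = re.compile(r"^/(?:post|image)/(\d+)")

# Hosts under tumblr.com that are not blogs (assumed, not exhaustive).
NON_BLOG_HOSTS = {"www.tumblr.com", "api.tumblr.com", "media.tumblr.com"}


def parse_tumblr_source(source_url):
    """Return (blog_host, post_id or None), or None if not a Tumblr blog URL."""
    parts = urlparse(source_url)
    host = (parts.hostname or "").lower()
    if not host.endswith(".tumblr.com"):
        return None
    if host in NON_BLOG_HOSTS or host.endswith(".media.tumblr.com"):
        return None  # direct image links, API calls and so on
    match = POST_PATH.match(parts.path)
    return host, (int(match.group(1)) if match else None)
```

A result with a post ID can go through the posts endpoint; a result with `None` as the post ID is a bare blog link and needs the blog-level check instead.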
 
https://api.tumblr.com/v2/blog/magnalunansfw.tumblr.com/info
 
"blog": { "ask": true, "ask\_anon": false, "ask\_page\_title": "Ask me anything", "can\_subscribe": false, "description": "", "is\_nsfw": false, "name": "magnalunansfw", "posts": 10, "share\_likes": false, "subscribed": false, "title": "moms-r-us", "total\_posts": 10, "updated": 1544792506, "url": "https://magnalunansfw.tumblr.com/", "uuid": "t:yW3JopGKeNFJZbtSGsxDpA", "is\_optout\_ads": false }
 
There are some obvious warning flags. Empty description, very low post count, and I bet that title is something they either use a lot or have a pool of titles to label things.
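
Those first two flags are easy to check mechanically. A minimal sketch, where `warning_flags` is a made-up name and the `min_posts` threshold of 20 is an arbitrary guess and not something measured:

```python
def warning_flags(blog, min_posts=20):
    """List heuristic warning signs found in a blog-info dict."""
    flags = []
    if not (blog.get("description") or "").strip():
        flags.append("empty description")
    if blog.get("total_posts", 0) < min_posts:
        flags.append("low post count")
    return flags


# Trimmed copy of the reply shown above.
info = {"description": "", "name": "magnalunansfw", "title": "moms-r-us", "total_posts": 10}
print(warning_flags(info))  # ['empty description', 'low post count']
```

A flag here is only a reason for a human to look at the blog, not proof that it has been taken over.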
 
When I get a chance, will see about (very slowly, 1/s or so) trawling source data on here through the API to see if I can build a list of these things and look for patterns.
Derpy Whooves

Looking For My Doctor
@Damaged  
Yeah, it’s unfortunately not very straightforward.
 
Like I mentioned above, images like this [NSFW] >>1955071 are good examples of blogs that are supposed to be gone, but are still available - albeit after 2 or 3 additional “Yes I want to see it please” clicks.
 
So, we don’t want to programmatically remove every above-safe Tumblr source URL, because a surprisingly large number of them still work (albeit in a very weird way), so the ones that I’m trying to focus on catching are only the hijacked ones that we really don’t want people following.
Damaged

Word Bug
@Derpy Whooves  
Right. Those “above safe” tumblr posts, however, still give valid data back through the API. That one in particular replies with all the usual data.
 
So the problem is twofold:
 
- Finding dead links
  - If all links to a blog are dead, and the blog itself still says it is there, it would be a suspect for a bot takeover
  - If only some links to that blog are dead, those would go into a list “to be marked dead source”
- Finding dead blog (404)
  - Might as well tag all links to that blog as dead source
- Finding living links
  - Do nothing
 
Then go through the suspected bot takeover list and examine the blog themselves for more suspect stuff. Perhaps even opening the page and seeing if it tries that redirect trick.
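
The sorting step could be sketched like this. It assumes the per-post and per-blog checks have already been run and their results collected; `triage` and its input shapes are invented for illustration.

```python
from collections import defaultdict


def triage(post_checks, blog_alive):
    """Sort checked links into suspected takeovers and dead sources.

    post_checks: iterable of (blog_host, source_url, post_is_live)
    blog_alive:  dict of blog_host -> bool (False means the blog itself 404s)
    Returns (set of suspect blogs, sorted list of dead source URLs).
    """
    by_blog = defaultdict(list)
    for blog, url, live in post_checks:
        by_blog[blog].append((url, live))

    suspects, dead = set(), []
    for blog, links in by_blog.items():
        dead_links = [url for url, live in links if not live]
        if not blog_alive[blog]:
            # Blog is gone: every link to it is a dead source.
            dead.extend(url for url, _ in links)
        elif len(dead_links) == len(links):
            # Blog claims to exist but every linked post is gone.
            suspects.add(blog)
        else:
            # Only some links are dead; living links need nothing.
            dead.extend(dead_links)
    return suspects, sorted(dead)
```

Blogs that land in the suspect set are only candidates; they still need the manual look described above.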
 
As an example, that link you gave gives valid data, it just has the meta flag “x_tumblr_content_rating”: “adult” on it. A blog taken over will never give back valid data using this endpoint on a post:
 
https://api.tumblr.com/v2/blog/htpot.tumblr.com/posts?id=182123278797
Damaged

Word Bug
Ran a test on the oldest 11 pages of tumblr sourced images. Got 2 false positives on the “suspected bots” listing, and a few more hostiles.
 
With the dead links, do you just want the IDs or the URLs?
Derpy Whooves

Looking For My Doctor
For myself, I really am focusing on fixing and re-verifying all the artist “User Links” that were lost as a part of the Tumblr diaspora - and if I find one that has been hijacked or redirected to malware or porn sites - regardless of whether that was originally a Tumblr link or not - then I work with the artist to get those links either re-homed, or deleted.
 
The hijacked source urls problem does occur with non-Tumblr sites, as well - it is a problem that happens with all kinds of blogs - safe and non-safe. And while so far the majority of the ones that I’ve worked on have been Tumblr blogs, I have seen them with Twitter, and even DeviantArt galleries and homebrew sites that have been lost to the artist and repurposed by whomever either hacked or bought the domain out from underneath the artist.
 
So - if you found any redirects, please PM me those and I’ll take them on as a top priority to get fixed.
 
But if the link is just dead, but following the link you still end up looking at Tumblr or whatever site the URL points to in any way - even if it’s just the boilerplate “There’s nothing here” Tumblr display, then that seems like a much lower priority problem.
 
So - yes, please PM me any redirects that you found, because I really want to make sure we get those off of the site.
 
But I really am not going to be looking at dead sources that aren’t redirects any time soon.
Damaged

Word Bug
Almost done grabbing… ~330000 images of data (single thread, pausing between fetches, and running it on home internet rather than my EC2). Will start crunching those for a list of suspected bots. Making sure to save all the data incrementally, don’t want to have to attack your server again.
 
There are a few blogs I noticed where the owner has deleted all their content and info, and left either 1 post or no posts to say they are gone. These I need to try to filter out. I may have to do an extra pass and actually follow the links to see if they redirect.
Derpy Whooves

Looking For My Doctor
@Damaged  
You should only have to look once per artist tag - if one of their source urls is bad, then all of them would be bad. So, can you try doing it once per artist, rather than per image?
Damaged

Word Bug
@Derpy Whooves  
Yeah, will be avoiding doing as much API hitting as I can on the Tumblr side, since unlike Derpibooru I can’t just get 15 things processed at a time (read: this may take a day or so)
Derpy Whooves

Looking For My Doctor
@Damaged  
No rush - seriously - I’ve got almost 400 artists still to re-verify and update links for. I will be working on this a little every day until probably March.
Damaged

Word Bug
Heh, the actual tumblr pages seem to use the following to redirect:
 
window.location.href = "http://bit.ly/2PrymKa";
 
I bet pretty much all of them are using bit.ly links, so those should be easy to locate.
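
A quick way to look for that pattern in a fetched page, as a sketch only. The regex just matches a plain `window.location` or `window.location.href` assignment to a quoted string like the one above, so anything obfuscated would slip past; the function names and the shortener list are my own choices.

```python
import re
from urllib.parse import urlparse

# Matches: window.location = "..."  or  window.location.href = '...'
REDIRECT = re.compile(r"""window\.location(?:\.href)?\s*=\s*["']([^"']+)["']""")


def find_js_redirects(html):
    """Return every URL assigned to window.location in the page source."""
    return REDIRECT.findall(html)


def is_shortener(url, hosts=("bit.ly",)):
    """True if the redirect target sits on a known link shortener."""
    return (urlparse(url).hostname or "").lower() in hosts


page = '<script>window.location.href = "http://bit.ly/2PrymKa";</script>'
targets = find_js_redirects(page)
print(targets, [is_shortener(t) for t in targets])
```

Intentional redirects to an artist's own domain would also match the first function, so the shortener test is what narrows the list down.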
Damaged

Word Bug
Well, 1147 files, each containing 300 images worth of data.
 
Now starting the slow grind of the Tumblr API. One check if the source is just a blog, two checks if it’s a link to a blog post. No checks if it has already hit that blog and found it dead/bot, or already hit that blog post and found it dead.
 
That should mean the more it does the faster it goes through them.
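
In code, the bookkeeping might look something like this. It is a sketch with invented names, and it remembers live results as well as dead ones, which goes a little further than what is described above. The two check functions are passed in so the caching can be tried without touching the network.

```python
class SourceChecker:
    """Remember blog and post results so each is fetched at most once."""

    def __init__(self, check_blog, check_post):
        self._check_blog = check_blog  # callable(blog) -> bool
        self._check_post = check_post  # callable(blog, post_id) -> bool
        self._blogs = {}
        self._posts = {}
        self.api_calls = 0

    def blog_ok(self, blog):
        if blog not in self._blogs:
            self.api_calls += 1
            self._blogs[blog] = self._check_blog(blog)
        return self._blogs[blog]

    def source_ok(self, blog, post_id=None):
        # A dead or hijacked blog settles it without a post lookup.
        if not self.blog_ok(blog):
            return False
        if post_id is None:
            return True
        key = (blog, post_id)
        if key not in self._posts:
            self.api_calls += 1
            self._posts[key] = self._check_post(blog, post_id)
        return self._posts[key]
```

With this shape, a blog with hundreds of sourced images costs one blog check plus one check per distinct post, and nothing further once the blog is known to be bad.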
Damaged

Word Bug
Okay. On the last pass now to verify the blogs have the redirect string on them.
 
I am pondering the logistics of actually making a program that crawls through the site’s sources image by image (because then it could use cache/etag to avoid poking everything) and then build a series of modules that it calls to search for exploits in the links in the descriptions and in the source_url.
 
Until Tumblr smarten up their act in regard to stopping bots/redirects, this will be an ongoing problem (though probably ever decreasing as their site seems to be imploding from a pony-posting perspective).