
Archiving like a boss! (UPDATE)

Journal Entry: Mon Mar 25, 2013, 8:37 AM
Stupid me deleted all the old rips. I'm working on getting a new one.

Update: Current as of 8/29/13. New download link: docs.google.com/file/d/0B38kVw…

Most of the description below still applies. The download is now in 7z format (www.7-zip.org/) to avoid breaking some poor computer that tries to open it with the clunky Windows zipper.

Addendum: This time, the archive rip consists of 83,784 files. No, they haven't written quite that many new fics in six months; I just didn't prune the smallest entries this time. Previously, I deleted everything below 3 KB. These small files are either fics where no more than the title has been written, epubs that are broken for one reason or another (I suspect they're the very first stage of pre-publishing), or a few empty placeholders. Whatever they are, you can safely ignore them unless you're specifically looking for one and you know it's supposed to be complete, in which case you can download a replacement manually. I kept everything so I'd have an exhaustive title reference while I work through the five-stars and four-stars, so I can tell when something is completely missing. It's a bit less work now, since these come *almost* fully tagged, but I still have to gather them manually.
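
For anyone who would rather set the tiny files aside than wade through them, here is a minimal Python sketch of the 3 KB cutoff described above. The function name `find_small_files` and the assumption that the rip is unpacked into a single folder tree are illustrative, not part of the original rip process. It only lists the files; it does not delete anything.

```python
from pathlib import Path

# 3 KB, the cutoff mentioned in the addendum above.
SIZE_THRESHOLD = 3 * 1024


def find_small_files(root, threshold=SIZE_THRESHOLD):
    """Return a sorted list of .epub files under `root` smaller than `threshold` bytes.

    Hypothetical helper: it only reports candidates so they can be reviewed
    (or moved elsewhere) by hand.
    """
    return sorted(
        path
        for path in Path(root).rglob("*.epub")
        if path.is_file() and path.stat().st_size < threshold
    )


if __name__ == "__main__":
    # "fimfiction_rip" is a placeholder folder name.
    for path in find_small_files("fimfiction_rip"):
        print(path.stat().st_size, path)
```

Listing first and pruning later keeps the "exhaustive title reference" intact while still making it easy to skip the stubs.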

Also, remember... This has EVERYTHING on Fimfiction. Use caution unless you want to be scarred for life.






So I've managed to pull a copy of (as far as I can tell) every single story on FimFiction in EPUB format. That's forty-thousand files! Even with the best download manager I could find, using FimFiction's sequential numbering scheme, queueing 40,000 files was almost more than my poor computer could handle.

It's a raw rip, so the file names are a bit messy, for example "its-always-sunny-in-fillydelphia-story=20936.epub", though the Title and Author tags, at least, are intact. It weighs 1.13 GB, is current as of February 9, and may take up to 20 minutes to decompress due to the sheer number of files. If you are feeling adventurous, or are a fellow archive warrior or fanfic connoisseur, here's the download link. docs.google.com/file/d/0B38kVw… (Warning, this has EVERY fanfic on FimFiction, including the ones I'd rather not think about, and the ones that aren't much more than a title.)
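
Since the raw file names follow the pattern shown above (a slug, then "-story=", then the story number), they can be split apart mechanically. The following is a small sketch, assuming every file in the rip follows that same pattern; `parse_rip_filename` is a made-up name for illustration.

```python
import re

# Pattern taken from the example file name above: <slug>-story=<number>.epub
FILENAME_RE = re.compile(r"^(?P<slug>.+)-story=(?P<story_id>\d+)\.epub$")


def parse_rip_filename(name):
    """Split a raw rip file name into (story_id, slug).

    Returns None when the name does not follow the assumed pattern.
    """
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    return int(match.group("story_id")), match.group("slug")


if __name__ == "__main__":
    print(parse_rip_filename("its-always-sunny-in-fillydelphia-story=20936.epub"))
```

The story number is the useful part, since it is the same sequential number the site uses.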

I'm also in the process of making a much more organized and fully tagged Kindle Collection using only the fanfics that were good enough to make it to Equestria Daily. I am tagging each one, including their descriptions, completely by hand, so it's obviously taking a while and isn't ready for release yet. After 5 months and 900 fanfics, I've finished tagging the entirety of Nallar's Collection (nallar.me/fics/?order=length). Since that collection only contains fics that are hosted on Google Docs, I still have more work to do to get the others, thus the crazy FimFiction rip above. If that number on EqD is right, I have about 1500 more to go.

Obviously this endeavor will never really be complete, since new fics are being written all the time, but if I get enough interest, I will release the collection as it currently exists on my Kindle (just the stuff from Nallar).

(You may also be interested in my BGM and Everfree collections. fav.me/d5vrqn4)

(In case you were wondering, yes, I am indeed completely insane.)


  • Mood: Dumbfounded
  • Listening to: Everfree Radio
  • Reading: Twilight October
  • Watching: MLP-FIM (as usual)
  • Playing: Half Life 2
  • Eating: Hay fries
  • Drinking: Zap Apple Cider
:icondarkfur18:
Darkfur18 Featured By Owner Mar 23, 2014
I have every story of any importance (It has at least 1 chapter) archived from FIMfiction plus story data and images, ready to be hosted on an offline server.
It's a hefty 3.5 gigs.
Reply
:iconlahirien:
Lahirien Featured By Owner Mar 23, 2014  Hobbyist General Artist
When is it from?
Reply
:icondarkfur18:
Darkfur18 Featured By Owner Mar 23, 2014
I built a Linux program that does it automatically. It goes through the website like a boss.
Reply
:iconlahirien:
Lahirien Featured By Owner Mar 23, 2014  Hobbyist General Artist
Oh awesome! You sound like Nallar! nallar.me/fics/?order=length

I'm into archiving, so what I need is a script that can download all the epubs pointed at by download_epub.php?story=[1-nnnnnn] (not just the php pages like wget seems to do), ignore html files that mean there's no story for that number, then run as a cron job daily to top itself off. Is this something that can be done in a linux environment?
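
A rough Python sketch of the kind of script described above, under several assumptions: the URL template is passed in by the caller (the host below is a placeholder), a response counts as a story only if it starts with the ZIP signature (EPUB files are ZIP containers, so an HTML "no such story" page fails that check), and files are saved under a made-up `story=<number>.epub` name. The names `top_off`, `fetch_url`, and `looks_like_epub` are hypothetical; this is not Darkfur18's program.

```python
import time
import urllib.request
from pathlib import Path

ZIP_MAGIC = b"PK\x03\x04"  # EPUB files are ZIP archives and start with these bytes


def looks_like_epub(data):
    """True if the payload starts like a ZIP file rather than an HTML page."""
    return data.startswith(ZIP_MAGIC)


def fetch_url(url, timeout=30):
    """Download the body behind `url` (urllib follows redirects by default)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def top_off(url_template, first_id, last_id, out_dir, fetch=fetch_url, delay=1.0):
    """Fetch every story number in [first_id, last_id] that is not on disk yet.

    Returns the list of story numbers saved during this run.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for story_id in range(first_id, last_id + 1):
        target = out / f"story={story_id}.epub"
        if target.exists():
            continue  # already archived on an earlier run
        try:
            data = fetch(url_template.format(story_id=story_id))
        except OSError:
            data = b""  # network or HTTP error: treat as "nothing here"
        if looks_like_epub(data):
            target.write_bytes(data)
            saved.append(story_id)
        if delay:
            time.sleep(delay)  # be gentle with the server
    return saved


if __name__ == "__main__":
    # Placeholder host; substitute the real site address before running.
    template = "https://example.invalid/download_epub.php?story={story_id}"
    print(top_off(template, 1, 10, "epubs"))
```

Because numbers already on disk are skipped, running the same script once a day from cron would only pick up what is new, which is the "top itself off" behavior asked for above.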

Also, where are all the views coming from on this old journal?
Reply
:icondarkfur18:
Darkfur18 Featured By Owner Mar 23, 2014
First, yes. With a little modification my script will do exactly what you want.
Second, Library of Equestria thread on /mlp/
Reply
:iconlahirien:
Lahirien Featured By Owner Mar 23, 2014  Hobbyist General Artist
Would it be possible for me to get a copy of your script for personal use? I have a little Linux box and NAS I've been playing around with that I think would be perfect for that. I can probably do the modifications myself, it's just getting something working from scratch that I never seem to have time for (thus my having to resort to DownloadStudio).
Reply
:icondarkfur18:
Darkfur18 Featured By Owner Mar 23, 2014
Here: www.dropbox.com/s/c7dqw2mthcm9…

Open a terminal in the folder containing the files and type ./miner to run.
Super simple 2 question prompt!

Check the .epub files before you really get started because I don't have an epub reader, and tell me how it works.
Reply
:icondarkfur18:
Darkfur18 Featured By Owner Mar 23, 2014
Whoops, wrong one.
Here: www.dropbox.com/s/c7dqw2mthcm9…
Reply
:iconshadowflares:
Shadowflares Featured By Owner Mar 23, 2014
God tier archiving 
Reply
:iconsevensix1:
sevensix1 Featured By Owner Mar 22, 2014
Did you check for fics that were edited or deleted between the two rips?
Reply
:iconlahirien:
Lahirien Featured By Owner Mar 22, 2014  Hobbyist General Artist
No, but as I work on the Kindle collection, I occasionally find some that are missing from the latest rip but which can be found in the older versions. I don't think it's possible to check the whole rip for changes due to the way FimFiction works. Plus, doing anything to 100,000 files is not easy.
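
Spotting edits may not be practical, but finding stories that exist in an older rip and are absent from a newer one can at least be automated. Here is a minimal sketch, assuming both rips are unpacked into flat folders and the file names end in "story=<number>.epub" as in the raw rip; `missing_from_new` is a hypothetical name. It compares story numbers only and says nothing about whether a story's text changed.

```python
import re
from pathlib import Path

# Story number at the end of a raw rip file name, e.g. "...-story=20936.epub"
STORY_ID_RE = re.compile(r"story=(\d+)\.epub$")


def story_ids(rip_dir):
    """Collect the story numbers found in the file names of one rip folder."""
    ids = set()
    for path in Path(rip_dir).iterdir():
        match = STORY_ID_RE.search(path.name)
        if match:
            ids.add(int(match.group(1)))
    return ids


def missing_from_new(old_dir, new_dir):
    """Story numbers present in the old rip but absent from the new one."""
    return sorted(story_ids(old_dir) - story_ids(new_dir))


if __name__ == "__main__":
    # Placeholder folder names for two unpacked rips.
    print(missing_from_new("rip_2013_02", "rip_2013_08"))
```

Only file names are read, so this stays quick even with 100,000 files.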
Reply
:iconpinki3krew-pinkydash:
PINKI3KREW-Pinkydash Featured By Owner Aug 8, 2013
Now how on earth did you pull all these .epub files from Fimfiction? HTTRACK tends to only pull files of this sort when attempting to download epub files: "download_epub.php?story=1766"
Reply
:iconlahirien:
Lahirien Featured By Owner Aug 8, 2013  Hobbyist General Artist
It wasn't httrack. I only used that for Nallar's site. For this I used a regular download manager with sequential queuing to download everything pointed at by download_epub.php?story=[1-100000] or whatever the syntax was.
Reply
:iconpinki3krew-pinkydash:
PINKI3KREW-Pinkydash Featured By Owner Aug 8, 2013
The reason I ask is because I'm attempting to do something similar using wget [wget --recursive www.fimfiction.net/download_ep… ]; however, I've only been able to pull the php pages themselves.
Reply
:iconlahirien:
Lahirien Featured By Owner Aug 12, 2013  Hobbyist General Artist
Okie dokie... I'm back, and it was a trial version of DownloadStudio www.conceiva.com/products/down… that I used.
Reply
:iconpinki3krew-pinkydash:
Thanks for replying!
Me and a friend came up with a different solution though. He wrote a program designed specifically to grab .epub files off of FimFiction.



Reply
:iconpinki3krew-pinkydash:
PINKI3KREW-Pinkydash Featured By Owner Aug 8, 2013
Which download manager specifically? 
Reply
:iconlahirien:
Lahirien Featured By Owner Aug 8, 2013  Hobbyist General Artist
It took me a long time to find one that could handle that. Wget doesn't work because it only pulls the download instructions on the php page rather than its target. I'm on vacation at the moment, so it will be Monday before I can get back to my computer and see what it was called.
Reply
:iconpinki3krew-pinkydash:
PINKI3KREW-Pinkydash Featured By Owner Aug 8, 2013
Alright. Thanks for Helping Out :)
Reply
:iconpinki3krew-pinkydash:
PINKI3KREW-Pinkydash Featured By Owner Aug 8, 2013
A regular download manager?
Reply
:iconthelordfanboy:
TheLordFanboy Featured By Owner Apr 9, 2013  Hobbyist Digital Artist
Wait, you are going to convert every MLP fan fiction found on FiMfiction into Kindle E-books?
Reply
:iconlahirien:
Lahirien Featured By Owner Apr 9, 2013  Hobbyist General Artist
The ones I got from FimFiction are already in Epub. I use Mobi for the EqD collection, but I think Epubs are also compatible with Kindles. I don't plan to do much with the FimFiction pull other than using it as a source for the fics that are listed on EqD, but didn't get pulled into Nallar's collection (since they were links through to FimFiction instead of the normal Google Docs hosting).
Reply
:iconthelordfanboy:
TheLordFanboy Featured By Owner Apr 9, 2013  Hobbyist Digital Artist
I wish they gave free copies of FiMfictions on Kindle and Nook. But I think Fan fics are illegal to sell....
Reply
:iconlahirien:
Lahirien Featured By Owner Apr 9, 2013  Hobbyist General Artist
You can't get it through the kindle store, but you don't need to, because you can just send them over from your computer. Get Calibre. [link]

It's the best program for ebook management. It's a bit slow and methodical, but it shows progress, and you can just let it do its thing. I don't recommend trying to crunch all 40000 files, though.
Reply
:iconjasonbluefire:
JasonBluefire Featured By Owner Mar 25, 2013
I feel your insanity; I am the same way with movies.
Reply
:iconkyvex-ky-windcloud:
Kyvex-Ky-Windcloud Featured By Owner Mar 25, 2013  Hobbyist General Artist
I wish you luck on this endeavor and pray for your sanity.
Reply
:iconlahirien:
Lahirien Featured By Owner Mar 25, 2013  Hobbyist General Artist
I'm afraid my sanity is long gone. Ponies took care of that! :iconpinkiepieisinsaneplz:
Reply
:iconkyvex-ky-windcloud:
Kyvex-Ky-Windcloud Featured By Owner Aug 31, 2013  Hobbyist General Artist
They're still at work taking away your sanity, ain't they?
Reply
:iconlahirien:
Lahirien Featured By Owner Aug 31, 2013  Hobbyist General Artist
Reply
:iconkyvex-ky-windcloud:
Kyvex-Ky-Windcloud Featured By Owner Aug 31, 2013  Hobbyist General Artist
Mine is full of some goopy grey stuff... wait, I'm not supposed to open my own head. Darn, why didn't you tell me?
Reply
Details

Submitted on
March 25, 2013

Stats

Views
3,435 (3 today)
Favourites
8 (who?)
Comments
33