arrow00 wrote, 2008-01-10 12:17 am

fandom Meta: archiving stories directly from LJ

So, this is all (once again) cesperanza's fault. It all started with a post she made in which...well, I'll let you read it for yourself if you're interested, but what came to me while reading it was: we need better archiving. Not that existing archives aren't beautiful and whatnot, but there is no tie-in with LJ, and the LJ community is thriving, and many people (such as me) are posting directly to LJ and not bothering to archive (for a variety of reasons).

The problem with the LJ community is: stuff gets lost. I discovered that myself when digging up stories for my reclist. There are just too many wonderful gems disappearing into the mists of time. The folks at places like ds_weekly and others do a fantastic job of keeping us in touch with each other on a weekly basis, but their posts aren't searchable...and there is just no guarantee that LJ won't disappear like the wind in a desert.


The Proposed Solution

What I'm proposing is this: I'm willing to write an archiving script that will essentially pull LJ story postings and all the accompanying meta information and stuff them into a searchable database.

It would only do so for willing writerly participants. All that would be required of them is that they friend the archiving LJ users (ds_archivist or ts_archivist to start with) and follow a particular template when posting wherever they usually do, something like this:

Title: Guess Who's Coming to Dinner
Author: arrow00 (must be lj username)
Pairing: F/V
Rating: PG
Summary: Ma Vecchio and Dead!Bob find an uneasy truce.
etc. (TBD)

(Hm. I think I might actually write this story.)
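For the curious, here is a rough sketch of how the script might read that header template. It is only an illustration in Python: the field list is still TBD, and the names `HEADER_RE`, `parse_story_header`, and `is_archivable` are made up for this example.

```python
import re

# Field names are taken from the sample template; the final list is still TBD.
HEADER_RE = re.compile(
    r"^(Title|Author|Pairing|Rating|Summary):[ \t]*(.+)$",
    re.MULTILINE,
)

def parse_story_header(post_text):
    """Return the template fields found in a post, keyed by lowercase name."""
    fields = {}
    for name, value in HEADER_RE.findall(post_text):
        # Keep the first occurrence so a stray "Title:" in the story body is ignored.
        fields.setdefault(name.lower(), value.strip())
    return fields

def is_archivable(fields):
    """Assumption: a post needs at least a title and an author to be archived."""
    return "title" in fields and "author" in fields

sample_post = """Title: Guess Who's Coming to Dinner
Author: arrow00
Pairing: F/V
Rating: PG
Summary: Ma Vecchio and Dead!Bob find an uneasy truce.

Story text goes here.
"""

header = parse_story_header(sample_post)
```

Posts that don't follow the template simply produce an empty result and would be skipped, which keeps the whole thing opt-in.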

The participants can post to whatever communities they like; I will simply make the archivist users join all the communities involved and check those communities and users daily for new stories.

What do you guys think? Will this fly? Are people willing to have their stories added to a centralized, searchable archive automatically?

The beauty of this system is that a user can change their story and have it rearchived: they just make their edits and then change the date on the post. Sure, it will result in stories showing up on friends pages more often, but maybe they can preface the story with [posting updated] or something so people won't get mad.
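The rearchive-on-date-change rule could be as simple as the sketch below. It assumes the archive remembers the posting date it last saw for each post URL; the in-memory dict and the `sync_post` function are stand-ins for whatever database ends up being used.

```python
def sync_post(archive, url, posted_at, body):
    """Store or refresh one story; return 'new', 'updated', or 'unchanged'."""
    previous = archive.get(url)
    if previous is None:
        status = "new"
    elif previous["posted_at"] != posted_at:
        # The author changed the post date, so take the edited text.
        status = "updated"
    else:
        return "unchanged"
    archive[url] = {"posted_at": posted_at, "body": body}
    return status

# Hypothetical URL and dates, purely for illustration.
archive = {}
first = sync_post(archive, "http://example.com/post/1", "2008-01-10", "draft one")
again = sync_post(archive, "http://example.com/post/1", "2008-01-10", "draft two")
later = sync_post(archive, "http://example.com/post/1", "2008-01-12", "draft two")
```

Note that under this rule an edit without a date change is ignored on purpose, so the author stays in control of when the archive copy is refreshed.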

Obviously, there is more to think about and plan, but I find myself massively excited by this idea: it might bring the power and features of LJ to join with the missing features of an archive.

Heck, we can even (I think) include a comment box on the bottom of the story so folks can send their comments right back to the original LJ post.

Oh, and the best part about this: LJ will have already formatted the story HTML all pretty. I'll just grab everything between the cut tags.
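Grabbing the text between the cut tags might look something like this. It is a sketch that assumes the script sees the entry source with LJ's `<lj-cut>` markup still in place (the rendered page may look different), and `extract_cut_bodies` is a made-up name.

```python
import re

# Matches <lj-cut> or <lj-cut text="..."> up to the nearest closing tag.
CUT_RE = re.compile(r"<lj-cut\b[^>]*>(.*?)</lj-cut>", re.IGNORECASE | re.DOTALL)

def extract_cut_bodies(entry_source):
    """Return the contents of every cut in the entry, in order."""
    return [body.strip() for body in CUT_RE.findall(entry_source)]

entry = """Title: Guess Who's Coming to Dinner
<lj-cut text="Read the story">
<p>Once upon a time.</p>
</lj-cut>
Feedback welcome!
"""

story_parts = extract_cut_bodies(entry)
```

A regular expression is enough for a sketch like this; nested cuts or odd markup would probably call for a real HTML parser.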


arrow00.livejournal.com 2008-01-10 10:48 am (UTC)
Do people have issues with OTW? I only heard about it a week or so ago...

My proposal is orthogonal, anyway. I am hoping we can use existing archives such as 852prospect and squidge's DSA. If OTW eventually comes out with an open source archive package that's more robust than existing archives, we could always migrate the data to a new server and set it up. (I'm hoping for better searching. At present, you can't really search the story text at the aforementioned archives.)

Wherever we do end up, however, will be decided before we start gathering the stories. So anyone who has an objection to the ultimate archive location can simply opt not to participate.

->Arrow (up late with weird stomach-fu)

sallymn.livejournal.com 2008-01-10 11:04 am (UTC)
>Do people have issues with OTW?

Most people probably don't; mine's a personal one...

What's interesting is that I was thinking the other day about the possibility of doing up a "Gen TS for Slash Lovers" list (a friend and I did one for Blake's 7, ending up with 100 stories). An archive would be a godsend if I did have a go... I do like lists {g}


arrow00.livejournal.com 2008-01-10 05:28 pm (UTC)
Oh, me too. I adore them. But then I'm a nerd. :)

nos4a2no9.livejournal.com 2008-01-10 12:49 pm (UTC)
Do people have issues with OTW?

This is a weird and complicated story, but essentially the people who seem to have the biggest problems with OTW either have a personal issue with the folks behind the scenes (to the point where they have done fairly unscrupulous things like outing the project coordinators to RL family and employers) or object because the OTW project goals overlap with other individual archive/fan history projects. There seems to be a mentality of "but I'm already doing that!" among OTW's detractors and the sense that OTW is poaching on someone else's territory. Which is...well, crap, basically, and scans to me as a bit petty considering fandom in general needs all of the independence and structural and logistical support it can get. But I don't want to get into trouble discussing the specifics of a situation of which I only have the vaguest of understandings. The goals of OTW are worthy ones, and the conduct of the group behind it has been exemplary - very calm, cool and professional. Which cannot be said of its detractors.

Anyway, as to the archiving question, I think it's a great idea - I say, the more off-LJ archives, the better! OTW is a great project and I fully support it, but it can't hurt to make fic available in a wide variety of places, particularly fandom-specific ones.

Your idea of creating a script to automatically pull LJ entries (which authors can continue to edit and update) and channel them into a searchable database sounds great! I don't have any programming ability but I'd like to offer my help and support.
Edited 2008-01-10 12:50 (UTC)

arrow00.livejournal.com 2008-01-10 05:45 pm (UTC)
Thanks for your support, Nos.

>"but I'm already doing that!"

Personally, I'm a true believer in "tools, not rules," which is to say: people go where the tools are, and the best tool (unless there are anti-trust shenanigans like Microsoft engages in) usually wins. LJ, in this case, is the current leader.

And tools tend to inform the community they create. If a tool encourages linking (such as LJ does) people will link. If the structure of the tool forces people into isolated threads (such as LJ does) then it's more difficult to centralize the information. Hence my proposal.

In a way, what I am proposing is in keeping with OTW's precepts. We'll be using LJ's resources for our indexing.

If someone makes a better tool, it might take a while, but I do believe people will start shifting over. I'll try to make my tool flexible enough that it can handle multiple journaling sites (e.g., greatestjournal) and do the parsing based on their respective syntaxes.

I think I will proceed with creating the various accounts and start working on the script. Even if I can't get writers to participate, I can use it for my personal archiving purposes.