arrow00 ([personal profile] arrow00) wrote2008-01-10 12:17 am

fandom Meta: archiving stories directly from LJ

So, this is all (once again) [livejournal.com profile] cesperanza's fault. It all started with a posting she made in which...well, I'll let you read it for yourself if you're interested, but what came to me while reading it was we need better archiving. Not that existing archives aren't beautiful and whatnot, but there is no tie in with LJ, and the LJ community is thriving, and many people (such as me) are posting directly to LJ and not bothering to archive (for a variety of reasons.)

The problem with the LJ community is: stuff gets lost. I discovered that myself when digging up stories for my reclist. There are just too many wonderful gems disappearing into the mists of time. The folks at places like [livejournal.com profile] ds_weekly and others do a fantastic job of keeping us in touch with each other on a weekly basis, but they aren't searchable...and there is just no guarantee that LJ won't disappear like the wind in a desert.


The Proposed Solution

What I'm proposing is this: I'm willing to write an archiving script that will essentially pull LJ story postings and all the accompanying meta information and stuff them into a searchable database.

It would only do so for willing writerly participants. All that would be required of them is that they friend the archiving LJ users (ds_archivist or ts_archivist to start with) and follow a particular template when posting wherever they usually do, something like this:

Title: Guess Who's Coming to Dinner
Author: arrow00 (must be lj username)
Pairing: F/V
Rating: PG
Summary: Ma Vecchio and Dead!Bob find an uneasy truce.
etc. (TBD)

(Hm. I think I might actually write this story.)
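As a rough sketch of how a script might read that template, assuming each field sits on its own "Field: value" line at the top of the post. The function name `parse_header` and the `KNOWN_FIELDS` set are hypothetical, and the field list is only the one shown above (the final list is still TBD):

```python
import re

# Field names taken from the sample template; the final list is still TBD.
KNOWN_FIELDS = {"title", "author", "pairing", "rating", "summary"}

def parse_header(post_text):
    """Collect 'Field: value' lines from the top of a post into a dict."""
    meta = {}
    for line in post_text.splitlines():
        match = re.match(r"^\s*([A-Za-z ]+):\s*(.*)$", line)
        if not match:
            if meta:
                break  # first non-field line after the header ends it
            continue   # skip anything before the header starts
        key = match.group(1).strip().lower()
        if key in KNOWN_FIELDS:
            meta[key] = match.group(2).strip()
    return meta

sample = (
    "Title: Guess Who's Coming to Dinner\n"
    "Author: arrow00\n"
    "Pairing: F/V\n"
    "Rating: PG\n"
    "Summary: Ma Vecchio and Dead!Bob find an uneasy truce.\n"
    "\n"
    "Story text follows here."
)
header = parse_header(sample)
```

Unknown field names are simply ignored here, so a writer could add extra lines without breaking anything.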

The participants can post to whatever communities they like; I will simply make the archivist users join all the communities involved and check those communities and users daily for new stories.

What do you guys think? Will this fly? Are people willing to have their stories added to a centralized, searchable archive automatically?

The beauty of this system is a user can change their story and have it rearchived if they make their edits and then change the date. Sure, it will result in stories showing up more often (maybe they can preface the story with [posting updated] or something so people won't get mad.)
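The "change the date to rearchive" rule could look something like the sketch below. It assumes the script keeps a record of the post date it last archived for each story; `needs_rearchive`, the `archived` dict, and the post id are hypothetical names for illustration only:

```python
from datetime import datetime

def needs_rearchive(archived, post_id, post_date):
    """True if the post is new or carries a later date than the archived copy."""
    previous = archived.get(post_id)
    return previous is None or post_date > previous

# Hypothetical record: post id -> date of the copy already in the archive.
archived = {"arrow00/1234": datetime(2008, 1, 10)}

unchanged = needs_rearchive(archived, "arrow00/1234", datetime(2008, 1, 10))
redated = needs_rearchive(archived, "arrow00/1234", datetime(2008, 1, 12))
brand_new = needs_rearchive(archived, "arrow00/5678", datetime(2008, 1, 11))
```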

Obviously, there is more to think about and plan, but I find myself massively excited by this idea: it might bring the power and features of LJ to join with the missing features of an archive.

Heck, we can even (I think) include a comment box on the bottom of the story so folks can send their comments right back to the original LJ post.

Oh, and the best part about this: LJ will have already formatted the story html all pretty. I'll just grab everything between the cut tags.


[identity profile] sallymn.livejournal.com 2008-01-10 09:50 am (UTC)(link)
I only have a few stories, but any archive not on OTW I'd be happy to be in...
Edited 2008-01-10 09:51 (UTC)

[identity profile] arrow00.livejournal.com 2008-01-10 10:48 am (UTC)(link)
Do people have issues with OTW? I only heard about it a week or so ago...

My proposal is orthogonal, anyway. I am hoping we can use existing archives such as 852prospect and squidge's DSA. If OTW eventually comes out with an open source archive package that's more robust than existing archives, we could always migrate the data to a new server and set it up. (I'm hoping for better searching. At present, you can't really search the story text at the aforementioned archives.)

Wherever we do end up, however, will be decided before we start gathering the stories. So anyone who has an objection to the ultimate archive location can simply opt not to participate.

->Arrow (up late with weird stomach-fu)

[identity profile] sallymn.livejournal.com 2008-01-10 11:04 am (UTC)(link)
Do people have issues with OTW? Most people probably don't, mine's a personal one...

What's interesting is that I was thinking the other day about the possibility of doing up a "Gen TS for Slash Lovers" list (a friend and I did one for Blakes 7, ending up with 100 stories). An archive would be a godsend if I did have a go... I do like lists {g}


[identity profile] arrow00.livejournal.com 2008-01-10 05:28 pm (UTC)(link)
Oh, me too. I adore them. But then I'm a nerd. :)

[identity profile] nos4a2no9.livejournal.com 2008-01-10 12:49 pm (UTC)(link)
Do people have issues with OTW?

This is a weird and complicated story, but essentially the people who seem to have the biggest problems with OTW have a personal issue against the folks behind the scenes (to the point where they have done fairly unscrupulous things like outing the project coordinators to RL family and employers) or because the OTW project goals coincide with other individual archive/fan history projects. There seems to be a mentality of "but I'm already doing that!" among OTW's detractors and the sense that OTW is poaching on someone else's territory. Which is...well, crap, basically, and scans to me as a bit petty considering fandom in general needs all of the independence, structural and logistic support it can get. But I don't want to get into trouble discussing the specifics of a situation of which I only have the vaguest of understandings. The goals of OTW are worthy ones, and the conduct of the group behind it has been exemplary - very calm, cool and professional. Which cannot be said of its detractors.

Anyway, as to the archiving question, I think it's a great idea - I say, the more off-LJ archives, the better! OTW is a great project and I fully support it, but it can't hurt to make fic available in a wide variety of places, particularly fandom-specific ones.

Your idea of creating a script to automatically pull LJ entries (which authors can continue to edit and update) and channel them into a searchable database sounds great! I don't have any programming ability but I'd like to offer my help and support.
Edited 2008-01-10 12:50 (UTC)

[identity profile] arrow00.livejournal.com 2008-01-10 05:45 pm (UTC)(link)
Thanks for your support, Nos.

>"but I'm already doing that!"

Personally, I'm a true believer in "tools, not rules," which is to say: people go where the tools are, and the best tool (unless there are anti-trust shenanigans like Microsoft engages in) usually wins. LJ, in this case, is the current leader.

And tools tend to inform the community they create. If a tool encourages linking (such as LJ does) people will link. If the structure of the tool forces people into isolated threads (such as LJ does) then it's more difficult to centralize the information. That's why my proposal.

In a way, what I am proposing is in keeping with OTW's precepts. We'll be using LJ's resources for our indexing.

If someone makes a better tool, it might take a while, but I do believe people will start shifting over. I'll try to make my tool flexible enough that it can handle multiple journaling sites (e.g., greatestjournal) and do the parsing based on their respective syntaxes.

I think I will proceed with creating the various accounts and start working on the script. Even if I can't get writers to participate, I can use it for my personal archiving purposes.

[identity profile] vsee.livejournal.com 2008-01-10 11:51 am (UTC)(link)
Hiya, arrow,

I am the "some woman" (eeep!) [livejournal.com profile] cesperanza was referring to in that post, and I was sort of quoted out of context anyway. I am very much in favor of more widespread archiving. See my follow up post over here.

Your system is something like what I was picturing. I'd like to see if it is possible to only slightly alter existing habits (something like inserting tags or whatnot) and using the existing power of the resources we use.

[identity profile] arrow00.livejournal.com 2008-01-10 05:51 pm (UTC)(link)
I agree with your follow-up post, vsee. Archiving just isn't *easy* enough, ultimately. And there are other problems with it: it's not cross-communication enough, and people's comments usually go directly to the author--this discourages commenting and community reaction. For most, you also can't edit your story afterward w/o intervention.

Plus, it doesn't have cutie icons. :)

We'll see if I can get any traction with my idea, which I think of as sort of "glue" between the best features of both methods.

[identity profile] karieflybabe.livejournal.com 2008-01-10 12:19 pm (UTC)(link)
This is why I have a separate journal for my stories, to control confusion. It may be simple to others but I like it just fine. I use a variation of your proposed template.

Title:
Author: KarieAuthoress
email: karierauthoress@gmail.com
Permission to archive: Yes, but please tell me where.
Fandom: The Sentinel
Genre:
Pairing:
Summary:
Warnings:
Notes:

For all my stories, I do this.

For TS writers, though, I think there is an LJ community that is at least attempting to help us keep track of some of our stories.

I'll keep an eye on this for later...

[identity profile] arrow00.livejournal.com 2008-01-10 05:56 pm (UTC)(link)
Even with a separate journal, though, I've noticed (back when I only posted fic to mine) that people didn't realize they could find all my stories listed cleanly using the tags and the memories. Even though the links were right there in the menu! Also, it's not centralized or searchable like an archive would be.

I like your template very much. It's what I had in mind. I would have to encourage people to rate their stories in a readable way and also include pairings.

[identity profile] karieflybabe.livejournal.com 2008-01-10 08:11 pm (UTC)(link)
I actually picked up my template from Wonderful World Of Makebelieve back when I wanted to start posting my TS fic somewhere.

I agree about the tags, especially on some journal styles where there are no tags to click on and search. That can often be frustrating. Or if a writer doesn't use tags at all.

My only concern is most likely a technical one, something about the wording of your proposal. Clarification would be appreciated. Where would these stories be archived and in what format? I ask for a couple of reasons, but I want to make sure I don't put my foot in my mouth in the asking.

Nifty Tool

[identity profile] morgandawn.livejournal.com 2008-01-11 07:21 am (UTC)(link)
I am guessing here - but the tool itself will be flexible enough so you can archive into an existing archive (like the ones currently in place for DS and Sentinel) or, if there is no archive, the tool can be used to pull stories into a new archive. Of course, the people running the archive will have to agree to fans using the tool to push/pull the stories to the right destination.

You, the writer, post your stories in xxx community LJ. You opt in and use the posting format. At regular intervals, the script pings the LJ community, gathers the stories from those authors who have opted in, and delivers them to the designated archive.

Where the stories end up - well that is something that you will know BEFORE you opt in. The real work, IMHO, will be at the back end - with the people setting up and keeping the archives going. And those archives might be all over the place depending on the fandom. Personally, I think having multiple archives makes fandom more robust and resilient and this tool will work wonderfully in that arena. It also makes life simpler for the writers - you just post to the one LJ community and your story is uploaded for you to the archive.
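The opt-in flow described above might be sketched like this. It assumes the script already has a list of fetched entries; the community name, archive name, and `select_for_archive` are placeholders made up for the example, not real destinations:

```python
def select_for_archive(entries, opted_in, destinations):
    """Pair each entry from an opted-in author with its community's archive."""
    selected = []
    for entry in entries:
        if entry["author"] not in opted_in:
            continue  # author never opted in, so leave the post alone
        archive = destinations.get(entry["community"])
        if archive is not None:
            selected.append((archive, entry))
    return selected

# Placeholder data: which authors opted in, and where each community's stories go.
opted_in = {"arrow00"}
destinations = {"example_ds_comm": "example_ds_archive"}
entries = [
    {"author": "arrow00", "community": "example_ds_comm", "title": "Story A"},
    {"author": "someone_else", "community": "example_ds_comm", "title": "Story B"},
]
picked = select_for_archive(entries, opted_in, destinations)
```

Keeping the destination table separate from the opt-in list means the archive for a community is fixed and visible before anyone signs up.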


arrow00 - let me know if I have any of this backwards.

Re: Nifty Tool

[identity profile] arrow00.livejournal.com 2008-01-11 07:35 am (UTC)(link)
Nope, that's right. Hopefully owners of existing archives will allow me to run the script on their server and stuff the stories into their archives. If there isn't an existing archive or if the owners don't want to participate, we could create one on an independent server (someplace not corporate-owned, so we would have control.) But I'm hoping for the first alternative.

And, yes, participating authors would know the ultimate destination before opting in. The destination would not change without notifying authors and allowing them to "opt out" of having their stories moved.

The format would be html, the same formatting that appears on the writer's lj. I would just grab everything from between the cut tags. So I would require that the writer *use* lj-cut to indicate the start of their story.
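A minimal sketch of the "grab everything between the cut tags" step, assuming the script sees the raw entry markup with a literal `<lj-cut>` ... `</lj-cut>` pair (the rendered page may mark cuts differently). `extract_story` and `CUT_RE` are hypothetical names:

```python
import re

# Assumes the raw entry markup still contains <lj-cut ...> ... </lj-cut>.
CUT_RE = re.compile(r"<lj-cut\b[^>]*>(.*?)</lj-cut>", re.IGNORECASE | re.DOTALL)

def extract_story(entry_markup):
    """Return the HTML inside the first lj-cut, or None if there is no closed cut."""
    match = CUT_RE.search(entry_markup)
    return match.group(1).strip() if match else None

entry = (
    "Title: Guess Who's Coming to Dinner\n"
    '<lj-cut text="Read the story">\n'
    "<p>Once <i>upon</i> a time.</p>\n"
    "</lj-cut>"
)
story = extract_story(entry)
missing = extract_story("A post with no cut at all.")
```

Returning None when there is no cut gives the script a clean way to skip posts that don't follow the format.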

I will write up a requirements document once I have ironed out the technical issues. But please feel free to continue to ask any questions you might have, because they reveal areas that need fleshing out.



Re: Nifty Tool

[identity profile] karieflybabe.livejournal.com 2008-01-11 12:24 pm (UTC)(link)
Ok, well here's another one for you, and this is crucial to me in my continuing work as a writer of fanfic now. What about stories being posted to Insane Journal?

Am I going to have to switch back to using both LJ and IJ in order to continue participation should I decide to "opt in" to this?

Re: Nifty Tool

[identity profile] arrow00.livejournal.com 2008-01-11 06:00 pm (UTC)(link)
I believe insanejournal runs on the same engine as livejournal; at least, it looks like it outputs html the same way. If that's so, and if they have friending just like LJ, then there's no reason I couldn't plug into insanejournal as well as any other lj-type journals (such as greatestjournal).

However, I will be running the pilot program off of LJ, and hammering out the kinks there, since LJ has the most users. Other journals would come after the primary functionality had been worked out.

Re: Nifty Tool

[identity profile] karieflybabe.livejournal.com 2008-01-11 09:48 pm (UTC)(link)
Suits...

The only concern I would have right now would be using the other archives and html... 852 prospect and WWOMB use text format for a reason. They do not accept html.

Re: Nifty Tool

[identity profile] arrow00.livejournal.com 2008-01-11 10:27 pm (UTC)(link)
Both of those archives serve up html, it's just that when you submit stories you have to do so as text files.

852prospect, once you have submitted the files, saves them as .html files on the server. What I propose is to work with the archive server itself to save the stories as html on the server in the same format and stick the meta information in the db.

In the case of WWOMB it's slightly more complicated, because I believe it saves text with boldface and italic tags in the db and then formats the text on output. So, yeah, I'd have to swap out the p tags for line returns, but leave the itals and bold. I'm also not sure how to resolve the user issue: WWOMB has actual users who can log in and edit their stories after the fact, so there's the issue of mapping ljuser name to WWOMB username.
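The tag swap might look roughly like this. It is a sketch under assumptions, not WWOMB's actual storage format: it treats `</p>` as a paragraph break, `<br>` as a line return, keeps `<b>`/`<i>` (and, as a guess, `<strong>`/`<em>`), and drops every other tag. `html_to_wwomb_text` is a made-up name:

```python
import re

def html_to_wwomb_text(story_html):
    """Turn paragraph tags into line returns; keep only bold and italic tags."""
    text = re.sub(r"</p\s*>", "\n\n", story_html, flags=re.IGNORECASE)
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    # Drop every remaining tag except <b>, <i>, <strong>, <em> and their closers.
    text = re.sub(
        r"<(?!/?(?:b|i|strong|em)\s*>)[^>]+>", "", text, flags=re.IGNORECASE
    )
    return text.strip()

converted = html_to_wwomb_text(
    "<p>He <i>ran</i>.</p><p>She <b>stayed</b>.</p>"
)
```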

[identity profile] spuffyduds.livejournal.com 2008-01-10 01:22 pm (UTC)(link)
That sounds awesome and painless. (Well, not painless for YOU, but for the people getting automatically archived...)

[identity profile] arrow00.livejournal.com 2008-01-10 05:56 pm (UTC)(link)
Painless is what I'm going for, since that's what *works*. ;) Painful stuff gets shunted, and I do hate to be shunned.

[identity profile] laurie-ky.livejournal.com 2008-01-10 04:00 pm (UTC)(link)
I'd do it.
Laurie

[identity profile] arrow00.livejournal.com 2008-01-10 05:57 pm (UTC)(link)
Sweet!

[identity profile] pir8fancier.livejournal.com 2008-01-10 07:18 pm (UTC)(link)
Thumbs WAY up for me!

[identity profile] arrow00.livejournal.com 2008-01-11 06:49 am (UTC)(link)
Yay! \o/

[personal profile] jesse_the_k 2008-01-10 07:58 pm (UTC)(link)
Archiving is greatness.

The rest of this post is blither.

I've just arrived at dS fandom, and I would love to have an LJ-oriented archive. I try to keep current with new fic by following the weekly communities, picking up new names from the discussion and then using that info to search on DSA, but I know I'm missing wonderful stuff.

Consider: this fandom is a decade old. Testimony to the crack that is dS, and the Properly Prepared writers who improve on canon daily. But did someone penning the very first CoTW fic, on 13 May 1998, think I'd want to read it in 2008?

[identity profile] arrow00.livejournal.com 2008-01-11 06:49 am (UTC)(link)
That's the real problem. I like the idea of the script because it can archive stuff to any number of locations, so if a fandom has more than one archive, no problem.

And blither away.

[identity profile] morgandawn.livejournal.com 2008-01-10 11:26 pm (UTC)(link)
I like this idea - if I grok it correctly, your database will only archive the tag info - being able to read the actual story texts will still be dependent on whether the community is still online (or the writer).

As a side note to the community maintainers: I encourage them to use the current LJArchive tool and back up their communities every week. I wish there was an automated tool for that.

[identity profile] arrow00.livejournal.com 2008-01-11 06:50 am (UTC)(link)
Definitely (as we talked about) I will be archiving the text of the stories as well, just in case lj communities disappear into the night.

[identity profile] morgandawn.livejournal.com 2008-01-11 07:08 am (UTC)(link)
I love the fact that this is an opt in - and love even more the idea of having multiple ways to backup and archive our fic. I am looking forward to seeing the new 'tool'

[identity profile] aprilvalentine.livejournal.com 2008-01-11 03:02 am (UTC)(link)
There's been a long discussion on SENAD recently about just this issue. Would you be doing stories for just DS fandom or could there be TS stories too?

[identity profile] arrow00.livejournal.com 2008-01-11 06:53 am (UTC)(link)
As I mentioned in my post, I would def start with ts and ds. And hope that 852prospect would be willing to participate; if not, I would set up a server somewhere off the grid (not a paid-for server, in other words, but belonging to a sympathetic geek.)

But there would be the possibility of doing this for other fandoms as well; once I have the kinks worked out it shouldn't be too hard to add fandoms to the list.

[identity profile] true-brit.livejournal.com 2008-01-11 07:28 am (UTC)(link)
Brilliant concept! But you're still going to write, right?

*regards arrow sternly from beneath lowered brows*

[identity profile] arrow00.livejournal.com 2008-01-11 07:36 am (UTC)(link)
Ah. Heh.

Yes, it did occur to me that once I get sunk into this I might not surface for a while. OTOH, I might go back and forth between coding and writing, writing and coding. They exercise two very different parts of my brain. Sometimes I dream code on the screen, and sometimes I dream of boys boys boys.

A girl's gotta have variety. :)

[identity profile] t-verano.livejournal.com 2008-01-13 01:53 am (UTC)(link)
My brain, she is not working well, but from the little I grasp of this idea, I say "Hey, cool!" (I think. But then, I don't even know what OTW is. However, anything that makes fic more findable and more safely / redundantly stored makes me very happy, so I'd be in.)

BTW -- I know I have a bunch of your stuff to catch up on. ::drools (attractively... well, or not):: RL has kept me from being on LJ much for six (freaking) months now, except for sporadic little microbursts here and there, and I have approximately six million three hundred and eighty-two fics and friends' posts to catch up on. I may require IV amphetamines (and RL is still not cooperating as fully as it might), but eventually I intend to get a little less behind than I currently am. (God, I hope so. It's driving me crazy, everything I haven't read. Or said... )

[identity profile] arrow00.livejournal.com 2008-01-14 03:45 am (UTC)(link)
My sweet! So nice to see your shiny face. I'm sorry RL is being such a drag on your fandom time. Bad life! No cookie!

However, I don't think fandom stuffies should be viewed as obligations, so please don't feel any on my account. I'm just glad you are a-ok. And I'm happy to hear you will participate in my little plan!