The League of Extraordinarily Dull Gentlemen #6

cpressey opened this issue Oct 12, 2018 · 55 comments

@cpressey

cpressey commented Oct 12, 2018

The League of Extraordinarily Dull Gentlemen

Goals

  • To tell a single story over the course of 50,000 words.

(There were other goals but this turned out to be the main one.)

Code

Of particular interest are the world-description in 930 lines of Samovar and the 363-line Python 3 script that renders the generated events into sentences.

The version of Samovar used was 0.2. For more information on Samovar, see its entry at Cat's Eye Technologies or its repository on GitHub.

Novel

In celebration of GitHub's recent acquisition by Microsoft, I have provided this document for you in Microsoft Word format. It is 85 pages long and consists of 53,129 words.

(If you cannot (or prefer not to) view files in Microsoft Word format, there is also a Markdown version which you can view directly on GitHub.)

Preview

Because it tells a story over the course of 50,000 words, I feel that a single excerpt would not do it justice, so here are a handful of them.

Chapter 2

[...] Moonlight flooded in through the French window and illuminated the suit of armor. The shadowy figure rubbed his chin. An owl hooted outside and the shadowy figure froze. The shadowy figure coughed and was now sure no one else was about. An owl hooted outside and the shadowy figure froze. The shadowy figure cast a furtive glance around the room and coughed and was now sure no one else was about and examined the leather couch closely and leaned back in the leather couch and looked out the French window and examined the leather couch closely and looked out the French window and leaned back in the leather couch and got up and stretched and coughed and rubbed his chin and coughed and

Chapter 4

[...] Pranehurst put down the encyclopedia. Scurthorpe looked at Pranehurst. Throgmorton nodded to Pranehurst. Furze-Platt looked at Pranehurst. Pranehurst nodded to Throgmorton and nodded to Furze-Platt. Scurthorpe picked up the quill pen and got up and stretched and walked around the library and coughed. Throgmorton looked at Scurthorpe. "I shall write to Old Grisbourne. He will know just what to do," said Throgmorton. Throgmorton looked at Pranehurst. Scurthorpe walked around the library. Furze-Platt examined the bookshelf closely. Throgmorton brushed some dust off his coat sleeve. Pranehurst nodded to Scurthorpe and

Chapter 9

[...] Nearby there was a grandfather clock. Furze-Platt walked over to the fireplace. Throgmorton sat down on the leather chair and leaned back in the leather chair. Furze-Platt rubbed his chin. Throgmorton brushed some dust off his coat sleeve. Furze-Platt rubbed his chin and picked up the whiskey. Throgmorton looked at Furze-Platt. "I think YOU stole the silver statuette of Artemis, Furze-Platt!" shouted Throgmorton. "WHAT?" bellowed Furze-Platt. Throgmorton rubbed his chin and put down the newspaper. Furze-Platt walked around the sitting room. Throgmorton rubbed his chin and coughed and nodded to Furze-Platt. "I think YOU stole the silver statuette of Artemis, Throgmorton!" shouted Furze-Platt. "Well I never!" bellowed Furze-Platt. Furze-Platt spluttered and looked out the window.

Chapter 12

[...] Furze-Platt looked out the grimy kitchen window and put down the empty teapot and put down the empty kettle and picked up the tea infuser and picked up the empty teapot and put down the empty teapot and looked out the grimy kitchen window and rubbed his chin and picked up the cannister of tea and rubbed his chin and picked up the empty teacup and picked up the empty kettle and examined the grimy kitchen window closely and put down the empty teacup and picked up the empty teapot and put down the empty kettle and picked up the empty kettle and coughed and rubbed his chin and rubbed his wrist and coughed and looked out the grimy kitchen window and walked away from the grimy kitchen window and walked over to the oven and walked away from the oven and

(Original content of this post is retained below)


For past NaNoGenMos I've alternated between "experimental works" and generating "proper novels". Last year I did some "experimental works" so this year I guess I better generate a proper novel, hey? Not that I have the time for this.

Looking at my previous generators, The Swallows was essentially simulation-based and MARYSUE was more-or-less grammar-based. For this one, I'd like to combine the two approaches using techniques that could be called railroading (TVTropes link).

Also, both of those generators modelled the world as discrete objects, in the manner of, say, a typical text adventure game. In this one, in contrast, I'd like to model the world as a set of propositions, similar to the "database" in Prolog. (I don't think I'll actually use Prolog - I mean logic is great and all but I've never been convinced it's very good for programming in. I actually sketched a DSL for this approach a while back, but I don't think I'll use that either. With the right set of abstractions, doing it in a "mainstream" language should be fine, and Python is what I'm most used to these days.)

My hope is that those two things will work well together and will allow some more sophisticated narrative development, stuff that was kind of awkward in the previous generators, to fall out fairly naturally.

There are certainly other things I'd like to tackle, but finding the time to do what I've already described is already a stretch. But at least one deserves mentioning, which is the actual construction of sentences. The output of a simulation is a sequence of events, and yes you can write one sentence per event, but it's horrendous, even if you use a lot of templates. What would be ideal is if the actual "writing" part of the generator could construct sentences more "from scratch", based on a grammar (obviously) but not just expanding that grammar randomly (obviously) but rather reflecting the content of the events. This is obviously incredibly difficult and I'm not going to get very far in this area, but I'd regard even a tiny bit of progress here a success.

@cpressey
Author

(The NaNoGenMo rules say I can use my issue as a "dev diary", so here are some notes.)

(1) The article on Railroading on TVTropes talks primarily about tabletop RPGs, but mentions another context where it is seen (console RPGs). Generated narrative is yet another context, and there is a common method for railroading simulations which has been mentioned in passing in past NaNoGenMos but, to spell it out:

Keep re-running your simulation until you obtain a run that has the properties you want.

Here "the properties you want" would be: meeting all the chosen plot points. This is a simple and effective algorithm, with all the efficiency of bogosort. I think it will probably be sufficient for my purposes.
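A minimal sketch of that re-run loop, assuming a toy simulation that just emits random events. The event names, `simulate`, `meets_plot_points` and `railroad` are all hypothetical illustrations, not code from any actual generator:

```python
import random

def simulate(rng, steps=20):
    """Toy simulation (hypothetical): a list of randomly chosen events."""
    events = ["cough", "look_out_window", "accuse", "find_statuette"]
    return [rng.choice(events) for _ in range(steps)]

def meets_plot_points(run, plot_points):
    """True if every plot point occurs in the run, in the given order."""
    remaining = iter(run)
    # `point in remaining` consumes the iterator up to the first match,
    # so later plot points must appear after earlier ones.
    return all(point in remaining for point in plot_points)

def railroad(plot_points, seed=0, max_attempts=10_000):
    """Keep re-running the simulation until one run has all plot points."""
    rng = random.Random(seed)
    for attempt in range(1, max_attempts + 1):
        run = simulate(rng)
        if meets_plot_points(run, plot_points):
            return attempt, run
    raise RuntimeError("no acceptable run found")
```

Like bogosort, nothing guides the search: each rejected run is thrown away whole, so the cost grows quickly as the plot points become harder to hit by chance.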

(2) I did a little research and apparently there has been only one finished NaNoGenMo entry written in Prolog (though there have been at least 2 other attempts). Prolog would give a nice syntax and an efficient implementation for writing rules, and I probably "should" use it, but I'm more interested in picking particular answers (randomly++) instead of enumerating all answers to queries. The latter is certainly more what Prolog is known for; if there is support for the former in some Prologs, I'd have to research what it is, and I suspect that whatever it is would end up being "doing programming" in Prolog and I'd rather avoid that as (as I mentioned) I find that rather unappealing.

The reason I'm hopeful about the propositions approach is that, while the object-oriented approach is good at modelling the state of the physical world (the table is in a room, a box is on the table, the key is in the box), there is more state here than just that:

  • the state of the conversation (during dialogue)
  • the characters' mental states
  • the state of the story itself (e.g. what characters have already been introduced?)

In an object-oriented approach the temptation is to create separate classes for all of those. But then, when you need to make something conditional on more than one, you end up writing fairly complex code to check all the different classes of states. If, however, they're all modelled as propositions, stored in a single "database", the playing field is levelled, and you can write single, straightforward rules to check all these things at once.

That's the theory, anyway. The reality is probably that the rules become large and unwieldy and are difficult to make modular. The fact is, I'm not sure, but finding out would be one reason to try it.

(3) For construction of sentences, I guess the basic idea here would be to avoid templates and have each event associated with an independent clause tree instead; and have an unparser that renders the sentences. In between, there can be some code that rewrites trees to combine individual IPs into complex sentences, elaborating on them in the process.
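A rough sketch of the tree-then-unparse idea, under heavy simplifying assumptions: each clause is just a subject plus a tuple of verb phrases, and the only rewrite is merging adjacent clauses that share a subject. `Clause`, `merge_same_subject` and `unparse` are hypothetical names for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Clause:
    subject: str
    predicates: tuple  # one or more verb phrases sharing the subject

def merge_same_subject(clauses):
    """Rewrite step: fold adjacent clauses with the same subject into one."""
    merged = []
    for clause in clauses:
        if merged and merged[-1].subject == clause.subject:
            prev = merged.pop()
            merged.append(Clause(prev.subject, prev.predicates + clause.predicates))
        else:
            merged.append(clause)
    return merged

def unparse(clause):
    """Render one clause as a sentence, joining its predicates with 'and'."""
    return f"{clause.subject} {' and '.join(clause.predicates)}."

events = [
    Clause("Scurthorpe", ("got up",)),
    Clause("Scurthorpe", ("stretched",)),
    Clause("Throgmorton", ("looked at Scurthorpe",)),
]
sentences = [unparse(c) for c in merge_same_subject(events)]
```

Here `sentences` comes out as two sentences rather than three. A real rewriter would need more kinds of combination (subordination, pronouns and so on) than this single rule.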

@cpressey
Author

NaNoGenMo Dev Diary, Day -14:

Having just compiled Language survey 2017 I'm having serious second thoughts about using Python.

I mean it's not bad to use a programming language that you use every single day (almost) and that is by far the most-used language in NaNoGenMo, but...

I could do at least part of this in some other language; the problem lends itself to a pipeline where each phase reads data produced by the previous phase (in some intermediary format like JSON) and writes data for the next phase. Each phase could be in a different language.
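As an illustration of one such phase, here is a small sketch that assumes a JSON-lines intermediary format (one JSON object per line). The record fields `actor` and `verb` and the function names are made up for the example. In a real pipeline `infile` and `outfile` would be `sys.stdin` and `sys.stdout`, and a neighbouring phase could be written in any language that reads and writes the same format.

```python
import io
import json

def run_phase(transform, infile, outfile):
    """Read one JSON object per line, transform it, write one per line."""
    for line in infile:  # iterating a file object reads it lazily
        if line.strip():
            record = transform(json.loads(line))
            outfile.write(json.dumps(record) + "\n")

def render(event):
    """Hypothetical phase: turn an event record into a sentence record."""
    return {"text": f"{event['actor']} {event['verb']}."}

def demo():
    """Run the render phase over in-memory streams instead of stdin/stdout."""
    src = io.StringIO('{"actor": "Pranehurst", "verb": "coughed"}\n')
    dst = io.StringIO()
    run_phase(render, src, dst)
    return dst.getvalue()
```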

Looking over the language surveys, I note that the eager functional languages seem a bit under-represented: Emacs Lisp and SML and Clojure have each been used on one entry, but no one has ever yet used Scheme or Racket or Erlang or F#.

@enkiv2

enkiv2 commented Oct 18, 2018 via email

@cpressey
Author

file.readlines() currently uses an iterator (in other words, it's equivalent to the old xreadlines()), so if you can fit your whole per-pass logic in a 'for x in y.readlines()' block, you can do streaming for a single pass easily -- it's sufficiently lazy.

Oh! So, what you're saying is, I should write it in Rust? I agree, that's a grand idea. No-one's used Rust yet, either, apparently.

@cpressey
Author

NaNoGenMo Dev Diary, Day -12:

It has surely not escaped your notice, dear reader, that all the above description of this entry has been technical in nature. But what's the novel gonna be about, you ask.

Well, before signing up this year, my idea was to set it in a Victorian-era gentlemen's club and, because it would contain a certain (large) amount of The Swallows-esque Brownian motion ("Pranehurst looked out the window and coughed. Scurthorpe rubbed his chin and walked over to the fireplace. Pranehurst looked at Scurthorpe" etc., etc.), I was going to call it "The League of Extraordinarily Dull Gentlemen".

Now, I'm not saying I won't still do that, but the fact is @enkiv2 kind of scooped me (by 9 hours!) on the Victorian-men-of-learning-and-leisure milieu with #5, and I don't have any really good ideas for how to move the plot along in this sort of setting (and it does need a plot, because the kernel of this project is to have a simulation railroaded by a plot), and to top it off, I now have another idea, which is not as interesting technically but which will probably make a much better story, and so I will probably let it take priority over this project.

@tra38

tra38 commented Oct 21, 2018

That's the theory, anyway. The reality is probably that the rules become large and unwieldy and are difficult to make modular.

If that becomes a problem, you could look up Aspect-Oriented Programming. I never used it myself, but the methodology tries to deal with "cross-cutting concerns" by making the 'concerns' more modular (so you group rulesets under specific modules rather than having them "entangled" with each other). The archetypal example of AOP is "logging", and isn't novel generation nothing more than glorified logging of a person's thoughts?

@cpressey
Author

If that becomes a problem, you could look up Aspect-Oriented Programming.

As it turns out, the instructor for the compilers course I took as an undergrad was one of the Big Names in AOP circles. So of course, a lot of the project work we had to do in that course involved AOP, for what I was assured were very good didactic and vocational reasons, nothing at all to do with self-promotion.

In my professional life, I've never once used AOP, and in general, I've spent more time removing ill-considered "metaprogramming" from code bases than adding it in.

But that's neither here nor there. This ain't professional life, this is NaNoGenMo! Anything's worth trying!

@cpressey
Author

NaNoGenMo Dev Diary, Day -10:

Just a note relating to my earlier seemingly-agonizing-over-using-or-not-using-Prolog. There are two languages that are related to Prolog, that are both simpler, and could be interesting to look into in such a simulation as I was thinking of.

One is, if you place certain restrictions on Prolog, you get Datalog, which, if you like CS theory, has the neat property of being P-complete. You may have heard of the "does P=NP?" problem, and you may have heard of NP-complete problems, and you may well wonder, well, are there also P-complete problems? Yes there are, and answering a Datalog query is one of them. For a simulation where there are lots of rules but none of the rules is particularly complex by itself, Datalog might be sufficient.

The other is miniKanren, which I've been peripherally aware of for a long time but I've never quite figured out. Like Prolog, it lets you write logic programs, but it's much simpler. It's also traditionally embedded in some other language (traditionally, Scheme) and presumably interfaces well with it - i.e. you could probably do the "programming" in the outer language and just call miniKanren when you need to do a logic query (which would probably ease my concerns about "programming in logic".) So that would be something that would be nice to learn. It's always seemed like a really opaque topic, though, which is why I haven't yet. One gets the impression you're supposed to learn it by implementing it in your favourite programming language. There's an interactive tutorial but when the first example is not a legal program and the second example evaluates to "the reified value of an unbound query" it really makes me wonder what the word "tutorial" means to the author.

Anyway I now have no clear idea what I'm doing this November. So there.

@cpressey
Author

NaNoGenMo Dev Diary, Day -6:

I now realize fully that you don't need anything as powerful as Prolog, or even Datalog, to work with the "propositions world" I was envisioning, because the "propositions world" consists entirely of facts, no rules, and most of the complexity in logic programming comes from evaluating rules. If you only have facts, you only need a kind of pattern-matching. For example, suppose the database contains these facts:

character(alice).
character(bob).
weapon(gun).
drink(gin).
holding(alice, gun).
holding(bob, gin).

Then if we want to run a query like

character(C), weapon(W), holding(C, W).

to find all the characters who are holding a weapon, we simply look through all the propositions for one that matches character(C). We find character(alice), which matches with C=alice, so we replace C with alice in the rest of the query to get

weapon(W), holding(alice, W).

and we recursively run that sub-query, where we'll find a match for weapon(W) with W=gun, and our second sub-query will be

holding(alice, gun).

which we will find amongst the propositions when we look, so there are no more sub-queries to run, so yay, we got a match (C=alice, W=gun) and we record it somehow (append it to a list or whatever.)

Since we're running the sub-queries recursively, all we have to do is return from a sub-query, and the parent query will continue running where it left off; in this way it will return all matches. (You have to be careful to forget about variable bindings from previous match attempts, but that's straightforward if you allocate new data structures each time, instead of mutating them.)

This is like only 20% of the way to actual logic programming, but all the same, it sure beats writing conventional imperative (or even functional) code to manually look for all characters who are carrying a weapon.
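The search described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Samovar's code: facts are tuples of strings, a term starting with an uppercase letter is treated as a variable (borrowing Prolog's convention), and bindings are carried in a dictionary rather than substituted textually into the remaining query, which comes to the same thing.

```python
FACTS = {
    ("character", "alice"),
    ("character", "bob"),
    ("weapon", "gun"),
    ("drink", "gin"),
    ("holding", "alice", "gun"),
    ("holding", "bob", "gin"),
}

def is_var(term):
    """Assumed convention: variables start with an uppercase letter."""
    return term[:1].isupper()

def match(pattern, fact, env):
    """Return env extended so that pattern equals fact, or None on failure."""
    if len(pattern) != len(fact):
        return None
    env = dict(env)  # fresh copy, so bindings from failed attempts never leak
    for p, f in zip(pattern, fact):
        if is_var(p):
            if env.setdefault(p, f) != f:  # bind, or check an earlier binding
                return None
        elif p != f:
            return None
    return env

def query(patterns, facts, env=None):
    """Yield every set of bindings that satisfies all patterns."""
    env = {} if env is None else env
    if not patterns:
        yield env  # nothing left to match: record the result
        return
    first, rest = patterns[0], patterns[1:]
    for fact in sorted(facts):
        new_env = match(first, fact, env)
        if new_env is not None:
            # Recursively run the sub-query; returning from it resumes this loop.
            yield from query(rest, facts, new_env)

results = list(query(
    [("character", "C"), ("weapon", "W"), ("holding", "C", "W")], FACTS))
```

With the six facts above, `results` holds the single match `C=alice, W=gun`. Because `query` is a generator, picking one answer at random instead of enumerating them all is just a matter of collecting the results and choosing from the list.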

@enkiv2

enkiv2 commented Oct 26, 2018 via email

@cpressey
Author

we recursively run that sub-query

By doing this, you have implemented rules (modulo the problem of re-ordering parameter lists).

No, by doing this, I have implemented search. A rule, as the term is understood in logic programming, is a different thing.

And I honestly have no idea what you mean by "re-ordering parameter lists", because even if I did want to implement rules, I'm sure I wouldn't have to "re-order" anything. Did you mean "renaming variables"?

@enkiv2

enkiv2 commented Oct 26, 2018 via email

@cpressey
Author

substantially closer to a trivial constraint solver than you're making out to be

I have no idea how close it is to a "trivial constraint solver" because what does that even mean? Maybe it is a "trivial constraint solver". I could call it that and how could anyone say I was wrong?

The comparison I actually made was to "actual logic programming". Maybe the estimate of "20% of the way" is a bit low, but I stand by the idea that there is quite a bit more work you have to put into this to make it able to handle rules as well as facts (full unification and variable renaming, mainly, neither of which is particularly simple to get right, at least not for those of us who don't casually describe problems in terms of Herbrand universes.)

@enkiv2

enkiv2 commented Oct 26, 2018 via email

@hendrikboom3

Sounds like you're doing a more sophisticated version of what I'm trying in issue 16. I will probably be writing in Go. I've already coded a unification algorithm (which I admit is trivial progress). We'll see if I come up with anything, and I'll probably be following your progress with great interest.

One of the previous attempts that inspired me is likely your Swallows. I probably misremember it. Many of my original ideas arise from misremembering other peoples' work.

@cpressey
Author

@enkiv2

It seems like real unification is neither necessary nor desirable here, though.

I agree; didn't I already say basically that when I introduced my post yesterday with

I now realize fully that you don't need anything as powerful as Prolog, or even Datalog, to work with the "propositions world" I was envisioning

?

And the rest of that post was exclusively about that aspect (making queries in a propositions world, in contrast to logic programming); in order to focus on it, I completely ignored the other aspects of the project. Then you said,

we recursively run that sub-query

By doing this, you have implemented rules (modulo the problem of re-ordering parameter lists).

By "this" I thought you were referring to the aspect I was talking about in the post (making queries in a propositions world, and in particular, recursively running a subquery).

But I guess, now, that you were actually thinking about the entire project when you said that, including event simulation and railroading?

Whichever way it is, figuring out what you're getting at is, to be honest, kind of exhausting, and at the same time, you don't seem to be putting in a lot of effort to comprehend what I've written.

Please forgive me if I choose not to respond to your comments on my dev issue in future, as this is not my idea of a good time.

@cpressey
Author

Sounds like you're doing a more sophisticated version of what I'm trying in issue 16.

Not sure if this is more or less sophisticated than that, but, in the general sense of trying to combine simulation with structured plot, yes, these seem like similar ideas.

There was no plot at all in The Swallows, if your definition of "plot" means it has to have a resolution. It was only a collection of intrigue tropes. By the end, the model of the characters was somewhat sophisticated (Alice could know that Bob thought an item was in a certain place when Alice had actually moved it, etc.) but that was all the result of hacking over the course of the month, rather than of any planning.

@hendrikboom3

Sounds like you're doing a more sophisticated version of what I'm trying in issue 16.

Not sure if this is more or less sophisticated than that, but, in the general sense of trying to combine simulation with structured plot, yes, these seem like similar ideas.

There was no plot at all in The Swallows, if your definition of "plot" means it has to have a resolution. It was only a collection of intrigue tropes. By the end, the model of the characters was somewhat sophisticated (Alice could know that Bob thought an item was in a certain place when Alice had actually moved it, etc.) but that was all the result of hacking over the course of the month, rather than of any planning.

I was quite impressed by Swallows, however you deprecate your own work as "hacking". It was a real attempt at gaining some kind of consistency in the story, even though it didn't seem to have any kind of direction. The dead body in the bathroom was an inspiration. It added tension to the story just by being there.

Being impressed is probably why I'm trying to mimic or improve upon those ideas.

But hacking over the course of the month is my method, lacking any kind of a clear, implementable spec of what constitutes a good novel. I may get around to implementing some sort of planning, but a month is short.

It's only if I see some kind of pattern in the hacks that I'll consider I've made real progress.

@cpressey
Author

I was quite impressed by Swallows, however you deprecate your own work as "hacking"

Okay, a better idiom than "hacking" in this context might be writing [code] by the seat of your pants. :)

@cpressey
Author

NaNoGenMo Dev Diary, Day -2:

OK, first, at the risk of repeating myself: the technical ideas in this project are interesting (to me), but I don't have any really good ideas for plot or characters or setting to go along with them, so please don't expect the resulting novel to be any good at all.

That said. A lot of the ideas in this project are continuations of the ideas in Samovar. I looked at its code last night and cringed (well, it was a prototype). It would directly benefit from using the propositions-world query I described previously. I think it would not be hard to add negation now as well.

Now, as I said, I don't really want to use Python this year, and I'm nonplussed by the name "Samovar" and I don't remember why I gave it that name -- but those are, I suppose, minor considerations. I've only got a month. I can always translate it to Julia and rename it "Boxlozyte" later.

What I'm getting at is, it might play out like this: I might use NaNoGenMo as an excuse/opportunity to improve Samovar and make another release of it. Then (maybe) use that to generate a novel, one that tries to meet some of the technical goals I've mentioned, regardless of how poorly it works as a novel.

@tra38

tra38 commented Oct 31, 2018

OK, first, at the risk of repeating myself: the technical ideas in this project are interesting (to me), but I don't have any really good ideas for plot or characters or setting to go along with them, so please don't expect the resulting novel to be any good at all.

Here's a possible idea for you to look at.

When I saw Samovar, I thought about using it to "simulate" corporate warfare within a Cyberpunk city using some factions from an abandoned cyberpunk game I tried to build. You can have multiple corporations, several "industrial sectors" (Electronics, Energy, IT, etc.) and MacGuffins to fight over, and then rules to justify when a corporation declares war, when it declares peace, when it wins or loses battles, etc. You can also have internal events as well - such as a corporation reforming itself, focusing on research, etc. And then you might have other entities as well - hedge fund owners trying to manipulate the stock prices of the megacorps, police officers trying to enforce the law, etc.

@cpressey
Author

That could be interesting; it ties in with the goal of wanting to extend the scope of state that's tracked (state of the economy? geopolitical state?)

Under the current plans, though, the simulation will have a lot of random motion between plot points -- a good way to reach the 50,000 word quota, I think, but perhaps too silly for a gritty cyberpunk setting.

Factions and allegiances are a good way to generate conflict in a story, though.

@cpressey
Author

cpressey commented Nov 1, 2018

NaNoGenMo Dev Diary, Day 1:

Good progress. I didn't imagine I would start off NaNoGenMo by writing unit tests, but Samovar is not nearly as much of an embarrassment now, and that's good.

Just don't ask where Day 0 went.

@hendrikboom3

hendrikboom3 commented Nov 2, 2018

Care to provide a link to Samovar? It looks interesting.
The link https://github.com/catseye/Samovar gives me a 404 not found.

@cpressey
Author

cpressey commented Nov 2, 2018

@hendrikboom3

The link https://github.com/catseye/Samovar gives me a 404 not found.

That's really weird, because that's definitely where it is (and I don't even have the means to make it private by accident, free account holder me.) Maybe a temporary problem on GitHub's end?

There are lots of changes (like, you don't have to use Greek letters for variables anymore) on the https://github.com/catseye/Samovar/tree/develop-0.2 branch, so that's worth checking out, if you do manage to get the URL to work.

@cpressey
Author

cpressey commented Nov 3, 2018

NaNoGenMo Dev Diary, Day 3:

Not much progress today, so here's an unnecessarily verbose report on the current status.

1/3 Simulation

I've implemented the "propositions world" query in Samovar, and it works fine. I'm very pleased that it is not a full inference engine, yet it suffices for my purposes.

Samovar has a NOT operator, so I had to add a "failure as negation" step (search for term in database, and only succeed if you can't find it.)

I will also need to add some kind of binding-uniqueness condition since I don't really want to keep seeing "Alice looked at Alice."
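Both steps are small once bindings exist. The sketch below is an assumption about how they could look, not Samovar's actual implementation: facts are tuples of strings, uppercase-initial terms are variables, and the `asleep` predicate is invented for the example.

```python
from itertools import product

def is_var(term):
    """Assumed convention: variables start with an uppercase letter."""
    return term[:1].isupper()

def substitute(pattern, env):
    """Replace bound variables in pattern with their values."""
    return tuple(env.get(t, t) if is_var(t) else t for t in pattern)

def holds_not(pattern, facts, env):
    """Negation as failure: succeed only if the term is absent from facts.

    Only meaningful once every variable in the pattern is bound.
    """
    return substitute(pattern, env) not in facts

def all_distinct(env):
    """Binding uniqueness: no two variables may share a value."""
    return len(set(env.values())) == len(env)

def look_candidates(facts):
    """Bindings for a hypothetical 'X looked at Y' event."""
    people = sorted(f[1] for f in facts if f[0] == "character")
    for x, y in product(people, repeat=2):
        env = {"X": x, "Y": y}
        if all_distinct(env) and holds_not(("asleep", "Y"), facts, env):
            yield env

facts = {
    ("character", "alice"),
    ("character", "bob"),
    ("character", "carol"),
    ("asleep", "carol"),
}
candidates = list(look_candidates(facts))
```

With these facts, no candidate has Alice looking at Alice, and nobody looks at Carol while she is asleep.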

2/3 Story

Several participants this year have mentioned readability. I dealt with the issue of readability in A Time for Destiny in 2015, and readability is not my goal here. Instead, my goal is to tell a story over the course of 50,000 words. (A Time for Destiny arguably did that, if you look at the "She Gets the Guy" subplot, but in this case I mean, a single story, not an arc over many episodes.)

In the absence of any better ideas I'm settling on "The League of Extraordinarily Dull Gentlemen". Synopsis:

The golden falcon that normally resides on the mantel in the Great Hall, much beloved of all members of the Widgets Club, has been stolen. Accusations are made, improprieties surface, diaries are burnt. Also, Pranehurst has misplaced his umbrella, and asks people he meets if they've seen it, a lot. It is eventually discovered that [SPOILERS REDACTED]. In the end, Pranehurst buys a new umbrella.

It's unlikely that that alone will fill 50,000 words so there will almost certainly be a large amount of padding, of the Deep Hurting variety, around the middle.

3/3 Diction

I have decided that Samovar will not do any stylistic processing of the text. It will produce a straight-up "caveman narrator" series of sentences which bluntly describe the events.

Instead, I'm planning to write a separate tool that I'll use to post-process those gormless utterances into something nominally less grating to read.

In theory this other tool could be used on other texts, but I'm sure that's just a theory; the goal will only be for it to work for my use case (if I even get this far.)

@cpressey
Author

cpressey commented Nov 4, 2018

NaNoGenMo Dev Diary, Day 4:

This is more on the "meta" side of things, but, I just wanted you all to know that I've just muted issue #2 because, seriously, c'mon.

@cpressey
Author

cpressey commented Nov 6, 2018

NaNoGenMo Dev Diary, Day 6:

I've written out a sequence of plot points for the story and coded them in Samovar. Not every scene is fleshed-out, but they all have, like, a setting and some characters.

My generator script currently produces 20K words from this, and takes 2 minutes 47 seconds to run.

I'm backing away from the idea of having a diction filter that works on arbitrary text, mainly because of the unnecessary resources it will use on this novel (lots of sentences have the same structure and it will be parsing them over and over, and making it efficient just to handle this case doesn't seem worth the effort). Trying to come up with a more "semi-automatic" solution for that, now.

@cpressey
Author

cpressey commented Nov 7, 2018

NaNoGenMo Dev Diary, Day 7:

One step forward, two steps back: After some refinement of the scenarios and the diction filter, the generator started producing only 15K words. This was not unexpected, as "better" sequences of events, and their descriptions, are often shorter.

After more refinement, the number started to climb up again.

And then, I found a fairly major bug in my base rules: characters weren't actually ever dropping items. Fixing that means that all of the scenarios that run until the goal "X is holding Y" is met are now harder to satisfy, so the generator tends to take longer to run.

Which I guess is good, because it generates more words, but generation is starting to become a bit of a wait. Eight minutes!

And now I'm seeing a "maximum recursion depth exceeded" error which I haven't seen before. Will have to investigate, later on, when I can get a few minutes.

@cpressey
Author

cpressey commented Nov 8, 2018

NaNoGenMo Dev Diary, Day 8:

The "maximum recursion depth exceeded" error was because it built a sentence tree that was too deep for Python to process recursively. Even repr(tree) would crash on it. So I put in a depth-check when creating them, so that they never get quite that big now.
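One way such a depth check could work, sketched under assumptions: each node stores its depth when it is built, so checking never needs to recurse, and a join that would exceed the cap is refused. `Node`, `join` and the value of `MAX_DEPTH` are hypothetical, not taken from the actual generator.

```python
MAX_DEPTH = 200  # hypothetical cap, chosen to sit well below Python's recursion limit

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = tuple(children)
        # Depth comes from the children's stored depths, so no recursion here.
        self.depth = 1 + max((c.depth for c in self.children), default=0)

def join(label, left, right):
    """Combine two trees, or return None if the result would be too deep."""
    if 1 + max(left.depth, right.depth) > MAX_DEPTH:
        return None  # caller should end the sentence and start a new tree
    return Node(label, [left, right])

# Keep conjoining clauses until the cap refuses to grow the tree further.
tree = Node("coughed")
for _ in range(1000):
    bigger = join("and", tree, Node("rubbed his chin"))
    if bigger is None:
        break
    tree = bigger
```

The loop stops growing the tree once it reaches `MAX_DEPTH`, so any later recursive pass over it (rendering, `repr`, and so on) stays within bounds.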

Since I've set the bar at "Tell a story over the course of 50,000 words," I tried today to at least flesh out the plot points and the scenes so that the whole thing is, identifiably, a story. Nothing says it has to be a good story, of course. The point is, I have to remind myself to reach this bar first, before working on improving the little things that bug me but that don't stop it being a story.

The number of words it produces can vary a lot now, but last run it got 37K words after 4.5 minutes.

@cpressey
Author

NaNoGenMo Dev Diary, Day 12:

Not sure how much time I'm going to have to put towards this now. My own fault, I suppose, for picking such an ambitious interpretation of "generate a novel", i.e., trying to actually generate an actual novel, with plot and character development and other such tragically passéist conceits.

The setting has drifted away from Victorian to rather more 1920's, which I guess makes sense because P.G. Wodehouse is undoubtedly my biggest influence in this idiom. Though, the Monty Python sketch about the Upper-Class Twit of the Year competition should not be entirely discounted, either.

IIRC last run it got 47K words, so I'm sure it would be possible for me to find a random seed with which it gets 50K -- although, since it takes several minutes to run, actually finding that seed might take a little while. Plus, there were a few weak spots I definitely wanted to polish before release (like - if Scurthorpe puts down his drink, Pranehurst can pick up that drink, however, he should not actually drink from it. That's just bad form, what?)

@cpressey
Author

NaNoGenMo Dev Diary, Day 14:

I got a run that produced exactly 49,800 words. It took 7.5 minutes.

TIL what a "cento" is, because someone mentioned it in their issue and I didn't know what they were talking about so I looked it up.

Tomorrow's the halfway point so I'll probably post a preview, or -- since the idea of polishing it between now and the end of the month isn't very appealing to me, in large part because of how long it takes to run -- maybe just complete it. (Can always take a break and see if I feel like coming back to it and make a 2nd version before the month is over.)

@tra38

tra38 commented Nov 14, 2018

You may have done this already, but if not...you could probably write a script to keep running the program for you until you generate a text that contains 50,000 words or more. Railroading your railroaded system, basically. Just leave the computer running for X minutes, and hope you'll get lucky.
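A rough sketch of such a retry loop, assuming the generator can be wrapped as a function that takes a seed and returns the text. `find_long_enough` and `toy_generate` are hypothetical names; `toy_generate` is only a stand-in whose output length varies with the seed, not the real generator.

```python
import random
from typing import Callable, Iterable, Optional, Tuple


def count_words(text: str) -> int:
    return len(text.split())


def find_long_enough(generate: Callable[[int], str],
                     seeds: Iterable[int],
                     target: int = 50_000) -> Optional[Tuple[int, str]]:
    """Try each seed in turn; return (seed, text) for the first run
    that reaches the target word count, or None if no seed does."""
    for seed in seeds:
        text = generate(seed)
        if count_words(text) >= target:
            return seed, text
    return None


def toy_generate(seed: int) -> str:
    """Stand-in for the real generator: word count depends on the seed."""
    rng = random.Random(seed)
    return " ".join(["word"] * rng.randint(30_000, 60_000))


if __name__ == "__main__":
    result = find_long_enough(toy_generate, range(100))
    if result is not None:
        seed, text = result
        print(seed, count_words(text))
```

With runs taking several minutes each, the seed range is effectively a time budget.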

@hendrikboom3

Maybe leave it running overnight.

@cpressey
Author

Railroading your railroaded system,

Yo dawg, I heard you liked railroading.

Maybe leave it running overnight.

This is a good idea. It would also give this laptop's fan an opportunity to make a joyous noise for an extended period.

However, I have found a seed that will probably suffice. The latest output has 53K words. I just want to proofread it before I release it.

In the meantime though, I've released Samovar 0.2, which is the version of Samovar I've used to write this thing.

@cpressey changed the title from "Railroading in a "propositions world"" to "The League of Extraordinarily Dull Gentlemen" on Nov 15, 2018
@cpressey
Author

For the benefit of anyone subscribing to notifications of this issue, I've released the code and the novel and I've updated the first post in this issue with links to them, plus some excerpts.

@hendrikboom3

hendrikboom3 commented Nov 16, 2018 via email

@jlee50

jlee50 commented Nov 18, 2018

This is a nice move towards simulation for narrative. I find myself going through it to see what it does and doesn't do. I like how Scurthorpe does the telling about the theft of the statuette, for example, which appears to close the possibility of other characters telling Scurthorpe back. On the other hand, Scurthorpe seems to be unable to recall having told the other characters already. And yet, there are times when we forget ... 'stop me if I've told you this before, but'. And there are times when we deliberately re-tell, such as when giving reminders.

@cpressey
Author

@hendrikboom3

I've now looked at the documentation for the current version of Samovar. I'm a former language designer (I worked on Algol 68 long ago); I have to say that this is an elegant, simple language.

I'm glad you like the design. Programming languages were, for a long time, my primary interest. Maybe they still are, I'm not sure.

I agree that Samovar was a "success" here in the following limited sense: in the first comment on this thread I said I thought I wouldn't use this DSL because

With the right set of abstractions, doing it in a "mainstream" language should be fine

But after using it, I don't think that's true. I think I'd be hard-pressed to find a mainstream language that would have let me write the story as quickly/easily/fluidly, given that I'd have had to embed the story in the language itself. The programming-related syntax is always one extra thing you have to think about. Maybe there's a mainstream language out there that has good support for getting around this, but I'm not aware of one.

At some point above I also mentioned rewriting it in Julia. That was in jest, but, actually, I've been looking into Julia. I thought it was something like R or Matlab and I have essentially no interest in data science or numerical computing, so I thought I wouldn't care for it, but actually it's perfectly suitable for general purpose programming, and quite interesting from a programming language design perspective.

@hendrikboom3

hendrikboom3 commented Nov 20, 2018

As a language aficionado, you should look into Racket. On the face of it, it's an implementation of a dialect of Scheme. But what recommends it is its language choice line at the start of every file. Racket is designed to allow lots and lots of syntaxes and semantices (what's the plural for semantics, anyway?) to be used compatibly in a multi-module program -- each module in its own language. There are mechanisms in the system to define these on top of Scheme. Syntax can be defined by a grammar that translates to abstract syntax; semantics can be defined by macros or interpreter or explicit code to translate to Scheme (as I understand it; I haven't done this myself -- yet).

So you can implement each part of a system in the language appropriate to it.

Racket even has an Algol 60 implementation somewhere.

And it has a very helpful mailing list. People actually like answering questions, although if it looks like a homework question, they only offer hints.

@hendrikboom3

hendrikboom3 commented Nov 20, 2018

I've been using Go lately. To search for info about Go, though, I search for golang. The word "Go" has too many meanings. Names should be more unique than that, to use a turn of phrase that would upset my old English teachers.

@cpressey
Author

@jlee50

On the other hand, Scurthorpe seems to be unable to recall having told the other characters already.

It is of course possible to make the characters "aware" they said something and not repeat themselves, but

a) in this generator, it's a bit fiddly and tedious to do so for every case
b) it tends to bring the word count down
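To illustrate the trade-off in (b), here is a toy sketch in Python rather than Samovar. The function `gossip` and its `remember` flag are hypothetical and do not correspond to anything in the actual generator; the point is only that recording a "has told" fact and checking it suppresses repeats, and therefore events.

```python
import random
from typing import List, Set, Tuple


def gossip(teller: str, listeners: List[str], topic: str,
           steps: int, rng: random.Random, remember: bool) -> List[str]:
    """Randomly pick a listener `steps` times and emit a telling event.
    With remember=True, the teller never tells the same listener twice."""
    told: Set[Tuple[str, str, str]] = set()
    events = []
    for _ in range(steps):
        listener = rng.choice(listeners)
        key = (teller, listener, topic)
        if remember and key in told:
            continue  # the teller recalls having said this; no event
        told.add(key)
        events.append(f"{teller} told {listener} about {topic}.")
    return events


if __name__ == "__main__":
    names = ["Pranehurst", "Faffchester"]
    topic = "the theft of the statuette"
    forgetful = gossip("Scurthorpe", names, topic, 20, random.Random(0), False)
    aware = gossip("Scurthorpe", names, topic, 20, random.Random(0), True)
    print(len(forgetful), len(aware))
```

The forgetful version always yields one event per step, while the aware version can never yield more events than there are listeners.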

So once I realized the goal was to tell a story over the course of 50,000 words, and that it didn't have to be a good story, nor did it have to tell it well, I stopped caring so much about little things like that.

(The choice of setting was also important for this reason. I'm banking on the reader having an easier time accepting the idea of idle rich men moving randomly and acting flakily, than, say, paramedics.)

But why did I not realize that was the goal, until halfway through? Because at the start it was buried under a heap of technical goals that looked unrelated.

But why that goal? Because it's quite common for a NaNoGenMo entry to take the approach: "Let's generate X's, and each X will be on average N words, so let's generate a series of (50000/N) of them and call it a novel." It's straightforward but it produces objects that you could call unrealistically excessive (e.g. a role-playing gamebook with 1000 sections). I of course don't have anything against unrealistically excessive objects per se but I do prefer it when they deal well with their own absurdity. (I'm not a fan of objects that are unrealistically excessive simply for the sake of being unrealistically excessive, you could say.) At any rate I wanted to do something different.

@kleer001

Interesting!

Where can I find out more about this samovar scene format? I've done some googling, but can't seem to find anything.

@hendrikboom3

https://github.com/catseye/Samovar/tree/0.2
See the doc subdirectory.

@cpressey
Author

@kleer001 @hendrikboom3 Sorry, I thought I had updated the top comment in this issue with a link to Samovar, but I hadn't. I have now. (I've also moved where the novel's code is located, fwiw).

In fact, Samovar 0.2 has been released so you can just go to https://github.com/catseye/Samovar and read the README and the docs from there. They're not fantastic but you can probably muddle through them.

One thing I'd like to write up someday is a comparison between Samovar and Sea Duck, because they're not dissimilar, but they're also quite different.

@hendrikboom3

Care to provide a link to Sea Duck? Googling gives me lots of waterfowl information.

@cpressey
Author

References for Sea Duck:
mentioned in the Resources thread here: #1 (comment)
GitHub repo: https://github.com/aparrish/seaduck

@cpressey
Author

NaNoGenMo Dev Diary, Day 30:

Retrospective.

I regret that I never made the characters say "I say!".

I regret that I never described them as wearing bowler hats (at least in the one scene that happens outdoors.)

I regret that I didn't think up the name "Faffchester" until just yesterday.

A big reason why I stopped developing this after the 15th is that it takes so long to run. This is basically inherent in the approach of making everything progress randomly and just hoping it all works out. At some point (around the 21st?) I did come up with an alternate method which (I think) produces very similar results but is much more efficient. But I didn't want to spend the remainder of the month working on it. I'll get to it in a second.

First, note that there is sort of a natural definition of "actor" in this setting: the actors in a scene are all the things that, if you removed them from the scene, nothing would happen anymore. You could rephrase this more technically, something like, the set of propositions that can participate in a pattern-match of one of the rules. Anyway, you can cause a scene to terminate by removing all the actors. I do that once or twice in the novel. But in other places it just runs on and on after the goal has been met.

Actually I'm not sure that's relevant to the more efficient method, now. But whatever, I wanted to note it.

The more efficient method would be this: make an actual planner/solver for Samovar -- one that searches for a path from the initial state to the goal (or vice versa) and returns the first one it finds. This alone would result in a novel that would of course be far too short for NaNoGenMo. But what we do, after finding this path, is to decorate it: rewrite chicanery and noise into it, as long as that chicanery and noise doesn't violate the consistency of the path. By "chicanery" I mean, if at some point Faffchester is standing, you can insert "Faffchester sat down on the couch. Faffchester got up off the couch", and you still have a consistent chain of events (and now it's longer and you're closer to finishing NaNoGenMo). By "noise" I mean things like "Faffchester looked out the window" which don't change the state at all (but bring you closer to finishing NaNoGenMo.)

I imagine the results would be similar to the purely-a-random-excursion-which-we-discard-if-we-don't-like technique. It should also be a lot more efficient.
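The plan-then-decorate idea can be sketched roughly as follows. This is a toy illustration in Python, not a planner for Samovar: `Rule`, `plan`, `decorate` and the example rules are all hypothetical, states are plain sets of proposition strings, and there are no variables or pattern-matching.

```python
import random
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional

State = FrozenSet[str]


class Rule(NamedTuple):
    text: str
    pre: FrozenSet[str]      # propositions that must hold
    add: FrozenSet[str]      # propositions asserted
    delete: FrozenSet[str]   # propositions retracted


def r(text, pre=(), add=(), delete=()) -> Rule:
    return Rule(text, frozenset(pre), frozenset(add), frozenset(delete))


def apply(rule: Rule, state: State) -> Optional[State]:
    if not rule.pre <= state:
        return None
    return (state - rule.delete) | rule.add


def plan(start: State, goal: FrozenSet[str], rules: List[Rule]) -> Optional[List[Rule]]:
    """Breadth-first search; returns the first path found, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal <= state:
            return path
        for rule in rules:
            nxt = apply(rule, state)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [rule]))
    return None


def decorate(start: State, path: List[Rule], detours: List[List[Rule]],
             target_len: int, rng: random.Random) -> List[Rule]:
    """Insert state-preserving detours until the path has target_len events."""
    path = list(path)
    while len(path) < target_len:
        states = [start]  # states[i] is the state just before path[i]
        for rule in path:
            states.append(apply(rule, states[-1]))
        options = []
        for i, state in enumerate(states):
            for detour in detours:
                s = state
                for rule in detour:
                    s = apply(rule, s) if s is not None else None
                if detour and s == state:  # applicable and changes nothing
                    options.append((i, detour))
        if not options:
            break
        i, detour = rng.choice(options)
        path[i:i] = detour
    return path


pick_up = r("Faffchester picked up the statuette.", pre={"standing"}, add={"holding"})
sit = r("Faffchester sat down on the couch.", pre={"standing"}, add={"sitting"}, delete={"standing"})
rise = r("Faffchester got up off the couch.", pre={"sitting"}, add={"standing"}, delete={"sitting"})
look = r("Faffchester looked out the window.")

start = frozenset({"sitting"})
goal = frozenset({"holding"})
short_path = plan(start, goal, [pick_up, sit, rise])
story = decorate(start, short_path, [[sit, rise], [look]], 10, random.Random(1))
```

Here `[sit, rise]` plays the role of "chicanery" and `[look]` the role of "noise". A detour is only inserted where replaying it returns to exactly the state it started from, so the rest of the path stays consistent.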

Well, maybe next year.

@mathias

mathias commented Nov 30, 2018 via email

@hendrikboom3

"But what we do, after finding this path, is to decorate it: rewrite chicanery and noise into it, as long as that chicanery and noise doesn't violate the consistency of the path."

That's what IBM did in the 60's or early 70's to debug their PL/1 compiler. They started with a simple program of known behaviour and introduced random changes that preserved semantics. Things like making a statement into a procedure and calling it. Like raising an exception and then handling it. Like adding zero to a variable. And so forth. They wrote a program that would make multiple changes randomly. It ran night and day, producing programs and letting the compiler compile them, checking the output.

So this technique is part of an old tradition. In AI deduction, of course, the trick has always been to eliminate all these inefficient detours. Here, and in compiler testing, we want them.

@hendrikboom3

I'm going to go on working on these things. Not stopping merely because November is over. Considering my first nanogenmo was actually in June by myself a few years back ... the advantage of November is that there are others (such as you) to interact with. I propose to continue discussion as I continue. If you would continue at least commenting on what I do I'd appreciate it. I realize you have other demands on your time so you probably won't be rewriting Samovar much.

Would the generative text mailing list be appropriate?

@hendrikboom3

hendrikboom3 commented Dec 1, 2018

What you have in Samovar is a way of expressing axioms for a concatenative language. I played with this stuff a few years back, but the chicanery you mention is a compositional operation rather different from the usual concatenation (and function calls) in Forth. We have a parallel execution semantics here.
If the chicanery consists of "Faffchester puts down the newspaper; Faffchester picks up the newspaper", the safest thing is to put that in anywhere Faffchester happens to be holding a newspaper. It doesn't change the semantics.
But putting it in as a single unit is a mere plot delay. To make it more engaging (especially if the newspaper is likely of interest to the reader) I'd want the two to be separated by other stuff (suspense?). So this chicanery becomes a plot thread that runs in parallel with the existing story, and the individual actions can get spliced into disparate locations, as long as the intervening actions in this thread and the main thread don't interact badly (judged by the propositions they depend on as their current states). This kind of parallelism isn't used in the usual concatenative languages, which follow a very strict stack discipline of do something and finish it before going back to what you were doing before.
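One way to check where the two halves of such a thread can be spliced is simply to replay the merged sequence. The sketch below is a toy in Python, with hypothetical names (`Action`, `run`, `valid_splices`) and propositions as plain strings; it enumerates every pair of insertion points at which the merged story stays consistent and ends in the same state as the main thread alone.

```python
from typing import FrozenSet, List, NamedTuple, Optional, Tuple


class Action(NamedTuple):
    text: str
    pre: FrozenSet[str]      # propositions that must hold
    add: FrozenSet[str]      # propositions asserted
    delete: FrozenSet[str]   # propositions retracted


def act(text, pre=(), add=(), delete=()) -> Action:
    return Action(text, frozenset(pre), frozenset(add), frozenset(delete))


def run(start: FrozenSet[str], actions: List[Action]) -> Optional[FrozenSet[str]]:
    """Replay actions from start; None if some precondition fails."""
    state = start
    for a in actions:
        if not a.pre <= state:
            return None
        state = (state - a.delete) | a.add
    return state


def valid_splices(start: FrozenSet[str], main: List[Action],
                  first: Action, second: Action) -> List[Tuple[int, int]]:
    """All (i, j) with i <= j such that inserting `first` before main[i]
    and `second` before main[j] is consistent and ends where main ends."""
    expected = run(start, main)
    if expected is None:
        return []
    found = []
    for i in range(len(main) + 1):
        for j in range(i, len(main) + 1):
            merged = main[:i] + [first] + main[i:j] + [second] + main[j:]
            if run(start, merged) == expected:
                found.append((i, j))
    return found


start = frozenset({"standing", "holding newspaper"})
main = [
    act("Faffchester sat down.", pre={"standing"}, add={"sitting"}, delete={"standing"}),
    act("Faffchester read the newspaper.", pre={"holding newspaper"}),
    act("Faffchester got up.", pre={"sitting"}, add={"standing"}, delete={"sitting"}),
]
put_down = act("Faffchester put down the newspaper.",
               pre={"holding newspaper"}, add={"newspaper on table"},
               delete={"holding newspaper"})
pick_up = act("Faffchester picked up the newspaper.",
              pre={"newspaper on table"}, add={"holding newspaper"},
              delete={"newspaper on table"})

splices = valid_splices(start, main, put_down, pick_up)
```

In this toy example the only forbidden placements are those that put the reading action between putting the newspaper down and picking it up again, which is the "don't interact badly" condition expressed as a replay check.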

@cpressey
Author

cpressey commented Dec 4, 2018

@hendrikboom3

If you would continue at least commenting on what I do I'd appreciate it.

Maybe opening an issue on https://github.com/hendrikboom3/nanogenmo2018 would work best? I mean I can't guarantee how much time I'll have to talk about it, but, sure, I can watch the repo for updates, and chip in my thoughts when I can.

I don't expect to work on Samovar again for a little while.

the chicanery you mention is a compositional operation rather different from the usual concatenation (and function calls) in Forth

I think of it as "replacing equals with (bigger) equals" in something that's more like an execution trace than a program.

By the way, this is basically how MARYSUE generated plots: instead of "X sits down in chair"/"X gets up from chair", it was "X is kidnapped by the bad guy"/"X is rescued by the good guys".

Around the start of the month I considered trying to generate the plot for this one, versus writing it by hand. My storytelling skills are essentially zero, so it was a tough call -- neither option looked significantly easier.

But if I had gone with generating the plot, I would have wanted to do something more sophisticated than what I did in MARYSUE, which I still only have vague ideas about, so I probably would've had to scale back and wouldn't have been happy with the result if I had gone that way.

I remember reading someone's advice for plotting once (I'd have to dig this up) which involved a matrix: 1 row per character and 1 column per chapter. If a chapter focused on one or more characters, they'd shade in those squares. They said this let them see things like when a character has been away from the story too long and should make a re-appearance, etc.
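As a rough illustration of that matrix (not the advice-giver's actual method, which I'd still have to dig up), here is a small Python sketch. The function name `long_absences`, the threshold, and the example data are all made up for the purpose.

```python
from typing import Dict, List, Tuple


def long_absences(matrix: Dict[str, List[bool]],
                  max_gap: int) -> List[Tuple[str, int, int]]:
    """matrix maps each character to one bool per chapter (True = featured).
    Returns (character, first_chapter, last_chapter), 1-based, for every run
    of more than max_gap consecutive chapters without that character."""
    found = []
    for name, row in matrix.items():
        run_start = None
        # The trailing True is a sentinel that closes a run ending at the last chapter.
        for chapter, present in enumerate(row + [True], start=1):
            if not present and run_start is None:
                run_start = chapter
            elif present and run_start is not None:
                if chapter - run_start > max_gap:
                    found.append((name, run_start, chapter - 1))
                run_start = None
    return found


# Hypothetical shading for six chapters.
matrix = {
    "Scurthorpe":  [True, True, False, True, True, True],
    "Pranehurst":  [True, False, False, False, True, False],
    "Faffchester": [False, False, False, False, True, True],
}
absences = long_absences(matrix, max_gap=2)
```

With this data, Pranehurst is flagged for chapters 2 to 4 and Faffchester for chapters 1 to 4, which is the "away from the story too long" signal described above.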

Also, people use the phrase "plot threads" and I do think a data structure for describing a plot, if it worked well, would probably consist of "threads" too.

But plotting is hard and November only hath 30 days.

@mathias

mathias commented Dec 4, 2018 via email

@cpressey
Author

@mathias

Any thoughts on writing something like Samovar as data structures in a Lisp? Then one could use macros and data transformation functions on the Samovar data itself.

Well, if I hadn't had an experimental DSL just hanging around that happened to be a good fit for what I wanted to do, it's likely I would've tried to write it in Scheme. In which case, yes, ideally it would've let you say something like

(rule
  ((actor A) (snack S) (holding A S))
  "?A ate the ?S."
  ((! (snack S)) (! (holding A S)))
)

There would be advantages and disadvantages of this. The main advantage is, as you say, you could manipulate that as a data structure. The main disadvantage, as I see it, is that you have to think about Scheme syntax when reading and writing that. For example, you'd need to backslash-escape any double quotes in the string. It's a small thing, but having a dedicated syntax for it, made it easier to read and write. (Plus, I don't have any clever ideas for something to do with those rules, if I were to manipulate them as data. If I did, I might have further thoughts on it.)

Of course, you could combine the two approaches, by exposing a parser for Samovar as a Scheme function -- or any other language of your choosing. You're still constrained by the host language's syntax for string literals, though. It's nicest if it has something like here-docs, where you can enter basically arbitrary text, but if it doesn't even let you have multi-line strings (e.g. older JavaScript), that'd just be brutal.
