· 6 years ago · Jan 12, 2020, 02:58 PM
Welcome to the Open CYOAI Project Wiki!

Here you will find most of the info regarding AI Dungeon 2 mods: how the game functions, how to play, the different versions that exist, etc.

Everything here is based on the experiences and posts of countless Anons from 4chan.

================================================================

>**DISCLAIMER**
>
>The pages in this Wiki may contain language and material that could be considered offensive/obscene/upsetting/insulting/disturbing to some people.
>
>The strong language used is not meant to insult or denigrate any particular person or group that may read the Wiki, nor are any references used to call out the reader meant to denigrate the related groups that are referenced. All language and interaction between the writer and the reader is meant simply as friendly banter, as well as to mock the lack of attention span or knowledge of the readers to push them to learn for themselves instead of being mocked constantly.
>
>The views and opinions expressed in this Wiki don't necessarily represent or reflect the views and opinions of those who own, maintain and contribute to this project, and are only relevant and/or meaningful in the context they are said as well as to serve the purpose described above.
>
>We are not responsible for the use or misuse of the content of this Wiki by the readers, who should use and read it at their own risk and be aware of the kind of content that may be contained here.

================================================================

Very brief summary

================================================================

>#### What is AI Dungeon?
>AI Dungeon is an AI-generated text-based adventure game with almost unlimited possibilities, where you can do literally whatever you want and the AI will respond and play Dungeon Master for you.

***

>#### How do I start playing?
>[Go here](https://github.com/VBPXKSMI/Open-CYOAI-Project/wiki/Online-versions-(works-on-mobile,-use-this-if-you-have-shitty-PCs)) and pick a Colab of your choosing.
>
>If you can't choose, just play [this one](https://colab.research.google.com/github/VBPXKSMI/Open-CYOAI-Project/blob/master/Open_CYOAI.ipynb).
>
>If you want to play the official version, which is inferior but just works out of the box, [go here](https://play.aidungeon.io/).

***

>#### List of the most useful Wiki pages:
>* [(FAQ) Frequently Asked Questions](https://github.com/VBPXKSMI/Open-CYOAI-Project/wiki/(FAQ)-Frequently-Asked-Questions).
>* [What are top_k, Temperature (temp) and top_p](https://github.com/VBPXKSMI/Open-CYOAI-Project/wiki/A-quick-explanation-on-what-is-top_k,-temp-and-top_p).
>* [Custom Prompts Index](https://github.com/VBPXKSMI/Open-CYOAI-Project/wiki/Custom-Prompts-Index).
>* [General Advice and Info](https://github.com/VBPXKSMI/Open-CYOAI-Project/wiki/General-Advice-and-Info).
>* [Colab Troubleshooting](https://github.com/VBPXKSMI/Open-CYOAI-Project/wiki/Colab-Troubleshooting).

***

>#### Briefing on the most useful values to alter to change how the AI behaves (think of them as different game modes):
>Temperature and top_p summary (you have to change them at the start of each new game in the modded versions):
>* high temp = more asspulls.
>* low temp = less random shit.
>* top_p close to 1 = more random AI.
>* top_p close to 0 = less creative AI.
>
>But just leave top_p at 0.9 (the default), because it is difficult to manage, and control randomness via temperature.

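For the curious, here is a minimal Python sketch of what temperature and top_p do to the AI's word choices. It is not the game's actual code: the function names, the candidate words and their scores are all made up for illustration.

```python
import math
import random

def softmax_with_temperature(scores, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    biggest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely options whose combined mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept

def sample_next(words, scores, temperature=1.0, top_p=0.9):
    """Pick one word at random from the options that survive the top_p cut."""
    probs = softmax_with_temperature(scores, temperature)
    kept = top_p_filter(probs, top_p)
    weights = [probs[i] for i in kept]
    return words[random.choices(kept, weights=weights, k=1)[0]]

# Hypothetical candidates and scores, purely for illustration.
words = ["sword", "dragon", "tavern", "banana"]
scores = [3.0, 2.0, 1.0, -1.0]

cold = softmax_with_temperature(scores, temperature=0.5)  # predictable
hot = softmax_with_temperature(scores, temperature=2.0)   # asspull territory
```

With a low temperature the top word hogs most of the probability, while a high temperature gives the long shots a real chance. The top_p cut then chops off the unlikely tail before the dice are rolled.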
================================================================

# READ THIS!!!

Every single question you might have **IS ALREADY ANSWERED SOMEWHERE IN THE WIKI**.

This "Starting Page" is intended to give you a briefing of the most basic info. So **READ THIS ENTIRE PAGE**, anon, or risk being ridiculed and told to fuck off in the threads.

The following are expandable tabs where the info is "hidden": **click on them and read the info**.

================================================================

<details>
<summary> About the game / Basic info </summary>

#
>#### What is this place? Where am I?
>You are inside the AI Dungeon 2 wiki, more specifically the "Starting Page and Basic Info" page, which will give you an overview of the game and its versions.

***

>#### Ok, but what is this AI Dungeon thing?
>AI Dungeon 2 is a text-based game where you, the player, play alongside an AI that acts as a dungeon master. Depending on what you tell the AI, it will generate random (and not so random) stories, which you can roleplay through with the AI as though it were a classic "Dungeons and Dragons" game, not dissimilar to other roleplaying tabletop/pen and paper games you usually see on /tg/.

***

>#### Great, but what is all of that supposed to mean?
>Have you ever played an "RPG" (role-playing game)? Like Final Fantasy or Dark Souls? No?
>
>Well, in those games you have your missions and quests, with the lore and story of the world already set in stone (by the developers of the game) and you, as the player, have a character to control in that world, following its rules while trying to complete your objectives, whatever they may be.
>
>Now, a dungeon master is the person in charge of creating the missions, quests, lore, story, and rules that **YOU** play with. This means that each time you play with a dungeon master, everything could change depending on the whims of whoever is acting as the dungeon master.
>
>These games are usually (but not always) played with other people around a table (hence the "tabletop" moniker), inside a house/clubroom, and played using dice, with pen and paper to keep track of the results. This is generally done with a minimal amount of technology, lest they lose the low-tech, homey charm that generally attracts people to them in the first place. Hence, they're a separate and unique form of game compared to more typical digital RPGs.

***

>#### But if the point of tabletop games is playing in person with others, then why would I want to play with an AI?
>Good question. Here are a few reasons.
>
>The potential, and fun, of this game is that **you control the story** while playing; the game is your restaurant, and the AI your maître d'. The AI will take your input, your story, and your ideas and run with them, hand-tailoring an experience built just for you, all in real-time and faster than any human dungeon master could ever hope to be. Nor do you have to handle the typical interpersonal affairs of a tabletop game: no waiting for people to all have free time, no being railroaded by a dungeon master caught up in their own grandiose stories, and no player drama to deal with.
>
>You can think of it like the difference between playing singleplayer vs multiplayer games: some people prefer the former, perhaps because they don't want to deal with people, or maybe they simply enjoy the games more when alone, but ultimately, they prefer a solo experience. In any case, both kinds of games are engaged and played with a different kind of mindset, but for those looking for that solo experience, this game is second to none in replicating the feel that classic tabletop games have to offer. That is no small feat, given the near-limitless creativity that the human mind can pour into interesting stories, but with its state-of-the-art AI and some of that special human ingenuity from you, dear reader, the game manages to do just that.

***

>#### Well, it's intriguing, but what can you do in the game?
>Short answer: **anything you can dream up**.
>
>Long answer: the most important thing is that with this game you can potentially roleplay the settings of your dreams, be that living in a fantasy land, tending a farm with your family and living a comfy life, taking on the role of detective in the crime-ridden city of Chicago, or anything in between. It can't be overstated: anything can be done in this game so long as you have the imagination.
>
>Of course, to answer the question many of you probably have right now: **yes, you can also create stories featuring your waifu** in any capacity that you can think up, whether that's talking to her, living a life together, or straight up fucking her brain into mush (literally, if that's your thing).
>
>This leads to one of the AI's greater advantages over human dungeon masters: its **capacity to create remarkable stories that can cater to your most /d/egenerate fetishes or embarrassing desires**. What does that mean? That the AI is capable of creating not just believable erotic stories, but engrossing ones, all without having to deal with the awkward and erratic nature of trying to "ERP" with other people through the internet, chatrooms, or worst of all, real life.
>
>All of that culminates in a game that doesn't really fit into any single genre. Depending on what you want out of the game, it could be a fantasy game, a sci-fi game, a dating simulator, a (good) porn game, or whatever else you think up. **What kind of game it evolves into is entirely up to the player**, and that is one of its greatest strengths: **it gives you the freedom to dream and see it all come to life**.
>
>Now, how about we cut you free of the drudgery of normal games for something colorful, and see what this AI has to offer you?

***

>#### Alright, you've got my attention now. But how does this all work? Where's the catch?
>The only catch is that, like any AI, it can get confused by extended context, and it will take some patience before it spits out that masterpiece story you're looking for. But if that's not a dealbreaker for you, then it's easy to get started, and this Wiki will show you how to do that, give you tips for managing its shortcomings, and so much more.

***

>#### That's great and all, but how does it *really* work?
>Brainlet warning: the intricacies of AIs get complicated and math-heavy **fast**.
>
>But as a brief overview for those less mathematically inclined, the AI works off probabilities. You train it on an enormous amount (literally >40GB, or right about **350 million lines**) of text written by humans and it analyzes that text to figure out which word is likely to come next in any given sentence. Then you give it a prompt to start with, and it takes off with it.
>
>For instance, if you were to give it **"You live in the United States"** as a prompt, the next two words will probably be **"of America"** because those are what it's seen follow "United States" the most. Of course, such predictability goes against the very unpredictable nature of natural language, so the AI doesn't always pick the single most likely word: it turns its scores for every candidate word into probabilities (with a softmax function) and samples from them, so that no given option is the answer 100% of the time. This encourages more varied responses and helps prevent looping when it runs into a predictable or generic sequence of text.
>
>As for the game's model itself, the default version is a 1558M (or -XL) GPT-2 model that was finetuned on 30MB of second-person CYOA (choose your own adventure) texts on top of the ~40GB of general web texts it was initially trained on.

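To make the "what word comes next" idea concrete, here is a toy sketch in Python that only counts which word follows which. GPT-2 is far more sophisticated (it is a neural network that looks at the whole preceding text, not just the last word), so treat this as an illustration of the principle under that big simplification. The tiny corpus and the function names are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    tokens = text.lower().split()
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def next_word_probabilities(follows, word):
    """Turn the follow-up counts for one word into probabilities."""
    counts = follows[word.lower()]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

# Made-up training text, standing in for the ~40GB the real model saw.
corpus = (
    "you live in the united states of america . "
    "the united states of america is large . "
    "the united states has many cities ."
)

model = train_bigrams(corpus)
probs = next_word_probabilities(model, "states")
most_likely = max(probs, key=probs.get)
```

In this toy corpus "states" is followed by "of" twice and by "has" once, so "of" comes out as the most likely next word, but "has" still keeps a one-in-three chance of being picked when sampling.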
***

>#### Ok, I've got the gist. But you mentioned models, data, training, finetuning and GPT-2. What the fuck are you talking about?
>As was said before, AIs are not simple, and transformer models (what GPT-2 is) are some of the more complex language models out there right now. Chances are you don't need to figure it all out, but the next section gives a briefing on the more technical terms. Consider it supplementary reading for those curious about the technicalities of the game, so read on if that sounds like you and you want to know more.

***

>#### I might take a look at it, but you're not just going to throw a bunch of info at me for the entire page, right? I want to play the game now that I'm hooked, you know?
>Rest assured, the info is only there for your own good and also to avoid unnecessary questions later on in the threads.<br>
>And of course, the how and where to play will be discussed in one of the sections below.

</details>

================================================================

<details>
<summary> Technical terms / Not so basic info </summary>

#
>#### I came here of my own volition cause I seek knowledge. So let's start: what is a model?
>So you know the files your game has in one of its folders, the one named "models"? That's the game's brain.
>
>Basically it is a trained brain for the GPT-2 based programs (in other words, the brain of your game's AI).<br>
>The final result of training the AI on a given dataset is the "brain" that determines its behavior once it's done learning.
>
>**In "simple" terms**: when you want to train the AI's "brain" you have the Machine Learning method you're using, a set of training data used to teach it, some way to check its results (separate data, a competing AI, or a combination of the two), the resulting model, and whatever software you stick the model in. You basically make it try and understand fuckhuge datasets until it gets better, more knowledgeable, coherent, etc.

***

>#### Understood. What is this GPT-2 thing then?
>**GPT-2** is a state-of-the-art transformer-based language model, trained on a dataset of 8 million web pages. It is trained with a simple objective in mind: predict the next word, given all of the previous words within some text.
>- A **Transformer** is a deep machine learning model used primarily for language processing, designed to handle ordered sequences of data, such as natural language (like in web pages).<br>
>It is typically used for:
> * machine translation
> * document summarization
> * natural-language generation (the use we are most interested in)
> * named entity recognition
> * speech recognition
>
>The folks over at [OpenAI](https://openai.com/) were the ones that developed and released these GPT-2 models, which come in varying grades of complexity.
>The models are (ranked from least to most complex):
>- 124M (wrongly nicknamed 117M)
>- 355M (wrongly nicknamed 345M)
>- 774M
>- 1558M (the most complex one).

***

>#### Ok, now I know what a model is and that GPT-2 is the "program" used to make the ones used in the game. But can you give a little more insight?
>The quick rundown version is that you break up huge volumes of text into formatted, paragraph-sized chunks, feed them to an AI blackbox and let it go to town.
>
>You feed the AI a bunch of data and it learns to respond based on that data: that's training. The creator of AI Dungeon finetuned the largest model (1558M) with 30MB of CYOA (Choose Your Own Adventure) data.
>
>Some Anons are trying to finetune new models based on different kinds of data, and some have succeeded. But training an AI takes a shitload of computing power. One Anon said that he rented a supercomputer for $100 to try and make one, but it turned out not to be so good, because you gotta spend a shitload of time properly formatting your input and making sure the perspective of the AI stays consistent. Formatting 8 million paragraphs of text by hand would take more than one human lifetime. So it's not only the size of the model that matters: the data quality matters as much, if not more.
>
>There is a "How to finetune models" Wiki page planned for the future, so that you can train your own models and (more importantly) contribute to the project; it will also contain even more detailed info about the subject. But for the time being, your best bet is using one of the many guides around the internet, e.g. [GPT-2-simple](https://github.com/minimaxir/gpt-2-simple) is an option to get started with the smaller models and see how it goes.
>
>Still, it's recommended that for now you try to contribute to other tasks, the one most related to training the AI being the formatting of the texts that are going to be used in the training.

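As a rough idea of what "breaking text into paragraph-sized chunks" could look like, here is a small Python sketch. It is not the script anyone actually used: the `split_into_chunks` name and the `max_chars` limit are assumptions made for the example, and real formatting work involves a lot more cleanup than this.

```python
def split_into_chunks(text, max_chars=1000):
    """Group paragraphs (separated by blank lines) into chunks of roughly max_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate  # still fits, keep growing the chunk
        else:
            if current:
                chunks.append(current)
            current = paragraph  # an oversized paragraph is kept whole
    if current:
        chunks.append(current)
    return chunks

# Made-up sample text with uneven spacing between paragraphs.
raw = "You enter the cave.\n\n\n\nIt is dark.\n\nYou light a torch."
chunks = split_into_chunks(raw, max_chars=35)
```

Here the first two paragraphs fit together under the limit and the third one starts a new chunk. Making sure every chunk also keeps a consistent perspective is the part that still needs a human.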
***

>#### Got it, what about explaining what 355M, 774M, etc. mean?
>These are the sizes of the model (i.e. the number of "parameters" you have to tune... think of it analogously to "the number of neurons in the brain"). The base model finetuned by the original author had 1558M parameters (i.e. 1558 million, or roughly 1.5 billion), which is the largest one.
>
>If you were to use smaller models instead, the game would require less processing power to run (e.g. the 774M uses half as much RAM as the 1558M one), but potentially at the cost of worse responses (especially with anything at or below 355M... where it's like talking with a schizophrenic).
>
>The 124M model was just made as a proof of concept, while the 355M one is not that much better than it.<br>
>355M and lower are pointless for the game, as no matter how long you train them, their attention (i.e. how much they care about and "remember" your prompts) isn't good enough to get sane responses. So the really useful ones for the game are the 774M and 1558M models, which are leagues apart from the smaller models and from each other.

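The "half as much RAM" figure can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below assumes every parameter is stored as a 4-byte float32 and takes the parameter counts straight from the model names (so they are rounded). It only covers the weights themselves, while the real RAM usage of the game will be higher.

```python
# Parameter counts taken from the model names (rounded).
MODEL_PARAMETERS = {
    "124M": 124_000_000,
    "355M": 355_000_000,
    "774M": 774_000_000,
    "1558M": 1_558_000_000,
}

BYTES_PER_PARAMETER = 4  # assumption: weights stored as float32

def weights_size_gb(model_name):
    """Approximate size of the model weights alone, in gigabytes."""
    return MODEL_PARAMETERS[model_name] * BYTES_PER_PARAMETER / 1e9

for name in MODEL_PARAMETERS:
    print(f"{name}: ~{weights_size_gb(name):.1f} GB of weights")

ratio = weights_size_gb("774M") / weights_size_gb("1558M")
```

Under those assumptions the 1558M weights take a bit over 6 GB and the 774M ones a bit over 3 GB, which is where the "half as much" comes from.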
***

>#### What does "training" mean? How is it different from finetuning?
>First off:
>- Training GPT-2 =/= Finetuning GPT-2
>
>Training can mean one of two things, and the meanings sometimes get confused.
>
>**First meaning**:
>
>The models themselves come pretrained (that's the "P" in GPT-2) on about 40GB of general text data. **Those models are already trained**; to train them again you would have to start from scratch on those 40GB of data (which would take you a good amount of time). When you "train" those GPT-2 models, all you are doing is **finetuning** the already pretrained models, to give them some idea of the output tone and format you want (i.e. making them more specialised, more knowledgeable about certain topics, etc.).
>
>**Second meaning**:
>
>You can "train" (**actually finetune**) an AI model, which is like a base for the AI's vocabulary, writing style, flavor, etc., by feeding it metric shittons of texts and letting it process them for a few hours/days depending on how much you give it. This is how the models the game currently uses were made.
>
>To avoid confusion, the summary is:
>
>**"Training" refers exclusively to how the GPT-2 models are already pretrained, and the data they were already trained with**.
>
>**"Finetuning" refers exclusively to how those pretrained GPT-2 models are put through the process of "re-training" them, not ditching the data they already have but adding more data on top and customizing them to how you would like them to perform in game**.
>
>So when one says "I trained a 774M model on 5MB data of dragon smut" what they mean and **should have said** instead is "I **finetuned** a 774M model on 5MB data of dragon smut".
>
>There is only one exception, where you could say "How would I **train the AI**?" because you are referring to the AI and not the model itself. But you definitely cannot say "How would I ~train a model~?" because you would (most certainly) be referring to the **pretrained model**, which can only be **finetuned** unless you were to train it from scratch, which you certainly are not gonna do.
>
>**To make it clear**:
>- Finetuning the existing GPT-2 model requires tens of MB of data and can be done in a day or two with a beefy computer.
>- Training a new model from scratch requires tens of GB of data and supercomputer time to do in any sort of reasonable time frame.
>
>Try to **always use the correct and proper meaning of these words**, as switching back and forth between them will only confuse people unnecessarily.

***

>#### What does the data the AI was trained on mean?
>The data refers to the texts that you feed the GPT-2 model to finetune the AI.
>
>Although you could technically feed it whatever text you wanted and it would "work", the AI would then be nothing more than a glorified autocomplete machine, as it would be devoid of everything that makes it feel like a game.
>
>So in order to adapt the AI to function (and feel) like a game, those texts **need to be formatted** in a certain way (**in the second person**) or else the AI will throw inconsistencies (like switching gender, not recognizing which character did what, etc).
>
>Finding and collecting, but especially formatting and curating, high quality texts takes a good amount of time. That is why it is such a big issue and why we need Anons to undertake that task themselves, unless a way of doing it automatically is found (which doesn't seem likely at the moment).
>
>A "How to format the texts to feed the AI with" Wiki page is in progress, so that you can help to format those texts and contribute something to the project. But you can already start helping by collecting texts and, when that page is ready, start formatting them.
>
>The original author did "considerably well" gathering his data but he could have done way better, which is why Anons are volunteering and taking up that task themselves.

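If you are collecting texts, a crude first pass for spotting paragraphs that are not in the second person could look like the Python sketch below. This is only a pronoun-counting heuristic invented for this example (the word lists and function names are assumptions, not part of any official formatting guide), and dialogue will easily fool it, so anything it flags still needs a human to look at it.

```python
import re

# Hypothetical pronoun lists used by this heuristic.
FIRST_PERSON = {"i", "me", "my", "mine", "myself", "we", "us", "our"}
SECOND_PERSON = {"you", "your", "yours", "yourself"}

def pronoun_counts(paragraph):
    """Return (first_person_count, second_person_count) for a paragraph."""
    words = re.findall(r"[a-z]+", paragraph.lower())
    first = sum(w in FIRST_PERSON for w in words)
    second = sum(w in SECOND_PERSON for w in words)
    return first, second

def looks_second_person(paragraph):
    """Guess whether a paragraph is narrated in the second person."""
    first, second = pronoun_counts(paragraph)
    return second > 0 and second > first

good = "You draw your sword and step into the cave."
bad = "I drew my sword and we ran into the cave."
```

A paragraph like `good` passes the check while `bad` gets flagged for rewriting. A script like this can only sort texts into piles; the rewriting itself is the manual work the project needs help with.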
***

>#### Well, that got confusing sometimes but I finally got it. Now, how would I go about using a different model for the game?
>You only have 3 options:
>- Training one yourself.
>- Finetuning a pretrained one.
>- Or using one of the already finetuned models made by Anons.
>
>The first option is practically impossible to do alone.<br>
>The second one is feasible if you know what you are doing.<br>
>And the third one is by far the easiest one.
>
>Where to find the finetuned models made by Anons will be discussed in the next section, alongside how and where to play the game.

</details>

================================================================