ProleWiki

A community related to the ProleWiki project.

Post in this community to request articles, provide suggestions, and discuss ways to develop our project.

founded 5 years ago
1
 
 

The idea is to have a pinned discussion post every week, similar to Lemmygrad's Theory Discussion Group community.

Every week, a different essay from ProleWiki would be picked, and we would discuss it in the thread.

Most essays take around 30 minutes to read and have a limited scope. This means that they can be read in a single sitting, and readers will have a more definite idea of what they want to discuss or critique compared to longer works of theory.

This idea would have a lot of benefits:

  • More people would engage with the essays on ProleWiki, giving more feedback to the authors.
  • More engagement and visibility for the ProleWiki community on Lemmygrad, especially if the post is pinned to the home page.
  • Readers would be exposed to varied topics that they might not have considered (or had the time to read a book about).
  • Opportunity for people with limited time to engage in discussions surrounding theory and practical questions.
  • Builds more of a sense of community, as the weekly format will have regulars.

I'd like to hear your thoughts!

2
 
 

I'm trying to find this book online but there's no way to pirate it. If you comrades are able to do it and add it to ProleWiki, that would be great!

goodreads link:

https://www.goodreads.com/book/show/33656291-the-civil-war-in-the-united-states

3
 
 

I accused the poster of being "rude, arrogant, unstable, and uninformed," and they have been itt; then it occurred to me maybe they clicked the drop-down arrow and it renders as shown in their screenie?

Thank you for your attention to this matter. 😏🫡

4
 
 

ProleWiki gets multiple shout-outs for the library as well as original essays on the latest episode of ProlesPod, Ep 98 - How to Study and Read Theory (ft. Andre from Weeb Revolution).

It's also linked to multiple times in the episode's show notes.

From what I understood, the guest appears to be involved with ProleWiki somehow. In any case, it was pretty cool to hear.

5
 
 

How many keywords can you stuff in a title, right?

I'm posting this in the ProleWiki community because we'll be discussing ProleWiki's own in-development RAG for LLMs. But first: you probably saw that the WSWS, i.e. the trots, published 'Socialism AI'.

In their press release, they basically congratulate themselves on how cool this is for the workers' movement and socialism, and great victory this and great victory that, blahblahblah. You know how trots are.

Their system is usable through ai.wsws.org or something iirc. It's a web interface, so yes, it's cool that it comes as a package you can just run from any device without having to fiddle with it, but there are also a lot of problems with it, especially coming from self-proclaimed communists. Though with how much of a joke trots are to everyone, I feel like I'm not really throwing oil on the fire with this post lol.

We looked into how their system works because they give absolutely 0 indication of the technical implementation, and found several copyright notices in the Terms of Service. They say that the output from their AI belongs to them, for example. Courts in the US have held that purely AI-generated output, with no human authorship, can't be copyrighted, which effectively puts it in the public domain, but sure I guess; not really my area of expertise.

We'll get into it.

Understanding what WSWS did

  • WSWS did not train a model from the ground-up
  • WSWS did not fine-tune an existing open-source model
  • WSWS is not running and hosting their own model.

What WSWS does (and you can find this out just by using browser tools, i.e. F12 on their homepage) is use the ChatGPT and Deepseek APIs.

Their pipeline is like this (as far as we can ascertain from simple browser tools):

You send your prompt -> they add their own instructions to it -> LLM fetches WSWS blog articles to answer your prompt -> LLM reads blog articles -> LLM answers your prompt with the WSWS blog articles as sources.

This is what we call RAG, or Retrieval-Augmented Generation. The technique is legit, I'm not disputing that; it's just that the way they did it is both inefficient and concerning.
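To make the pipeline above concrete, here is a minimal sketch of a generic RAG loop in Python. It is not the WSWS implementation (which is not public): the corpus, the instruction text, the word-overlap ranking, and every function name here are made up for illustration, and the LLM is passed in as a plain function so nothing depends on a real API.

```python
from typing import Callable

# Hypothetical stand-in corpus; a real system would hold actual articles.
ARTICLES = {
    "lenin-bio": "Lenin was born in 1870 in Simbirsk.",
    "october": "The October Revolution took place in 1917.",
}

# Assumed wording; the real instructions added by the operator are unknown.
SYSTEM_INSTRUCTIONS = "Answer using only the sources provided."


def retrieve(prompt: str, k: int = 1) -> list[str]:
    """Rank articles by how many words they share with the prompt."""
    words = set(prompt.lower().split())
    scored = sorted(
        ARTICLES.items(),
        key=lambda item: len(words & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]


def build_prompt(user_prompt: str, sources: list[str]) -> str:
    """Prepend the operator's instructions and the retrieved sources."""
    joined = "\n".join(f"- {s}" for s in sources)
    return f"{SYSTEM_INSTRUCTIONS}\n\nSources:\n{joined}\n\nQuestion: {user_prompt}"


def answer(user_prompt: str, llm: Callable[[str], str]) -> str:
    """Run the whole pipeline: retrieve, augment, generate."""
    return llm(build_prompt(user_prompt, retrieve(user_prompt)))
```

Calling `answer("When was Lenin born?", llm=some_function)` hands `some_function` one string containing the instructions, the retrieved source, and the question. Everything that makes one RAG better than another lives in `retrieve`.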

The Problems I have with that way of doing things

We'll get into the technical problems when I detail what the ProleWiki MCP will look like.

First, it's very closed-source and obfuscated. Mind you, I did not create an account (too much hassle if I want to retain my privacy on it), but you have to understand that your prompt + LLM output transits through OpenAI and Deepseek. There is no privacy when using this service; it goes straight to the feds with OAI.

Secondly, they sell paid tiers, starting at $5 per month for 150 messages, which is... absolutely nothing.

Thirdly everything is closed off. They did not release any documentation on how this works or how you could run this yourself.

Selling paid tiers is not a problem in itself, at least for me personally. You have to break even and they do pay for API access to OpenAI and Deepseek (though Deepseek is very cheap). The problem I have is that they should at least offer an open-source implementation for people who know how to use it, or at the very least make the RAG files available. This is not the case.

I'm also a proponent of paying it forward. Yes this costs them money, but they could find a way to break even in ways that don't consist of just selling another SaaS (software-as-a-service). Let people pay it forward for others or something. Accept that you will lose some money on running this and cover with dues or people in the party who have money and don't mind maintaining this service. Accept donations. Lots of ways you can do this that are not so commercial, i.e. "if you can't pay you must vacate the premises".

The technical implementation: ProleWiki MCP vs. Socialism AI

A few months ago we started working with a dev who was making the Marxists Internet Archive available for RAG use. This project evolved and they are now making a ProleWiki MCP with the pages we sent them. It'll still be RAG, but more efficient.

So first, let's look at how the Socialism AI RAG works. If you remember the pipeline:

You send your prompt -> they add their own instructions to it -> LLM fetches WSWS blog articles to answer your prompt (<-- we are here) -> LLM reads blog articles -> LLM answers your prompt with the WSWS blog articles as sources.

The problem we've found is what kind of data exactly the LLM gets access to. Imagine it like a bin the LLM can sift through to make an answer with. If you provide it with the link to the page, it parses that as html code, with all its tags, headers, script calls etc. Imagine me giving you a page full of html code and asking you "can you answer when Lenin was born from this info?" You can, but it's gonna take a while and a lot of it is simply unnecessary. And you only have this one page to make an answer. If Lenin's DOB is not neatly written on it, you have to do extra thinking to put it together (this is the context window - the LLM simply won't look through 250k WSWS articles, it has to pick and choose which articles are more likely to help answer the question).
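As a rough illustration of how much of a raw page is noise, the sketch below strips a made-up HTML page down to its visible text using Python's standard `html.parser`. The page content is invented, and character count is only a loose proxy for tokens, but the ratio gives the idea.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Made-up page, for illustration only.
PAGE = (
    '<html><head><script src="app.js"></script>'
    "<style>body{margin:0}</style></head>"
    '<body><div class="nav"><a href="/">Home</a></div>'
    "<p>Lenin was born in 1870.</p></body></html>"
)

text = visible_text(PAGE)
print(len(PAGE), "characters of HTML ->", len(text), "characters of text")
```

Even after stripping tags, navigation text such as "Home" survives, which is one reason retrieval at a finer grain than the whole page helps.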

Therefore we can optimize this bin. Instead of giving you full pages you can pick from, we can give you individual lines. In our RAG for ProleWiki, what our dev did was some math that extracts every line from our pages on the principle of 1 line = 1 idea. Then it puts these ideas together in a matrix and sorts them by semantic closeness.

What this means is that if you're the LLM, you don't get a full page on the October Revolution or Lenin to answer a question with. You can see our page on Lenin is quite lengthy, and if you ask a question whose answer is not on this page when the LLM pulls it (for example, you can see the self-exile section is empty), it might not answer your question as well as it could.

With the semantic matrix, instead of picking from pages, it picks from lines to make a coherent answer. Instead of looking at just Lenin's page and filling its entire context window with it, it looks at semantic information relating to Lenin's self-exile on ProleWiki - or other sources you add to the corpus, the 'bin' - and then makes an answer on this.

tl;dr:

This means if we have information about Lenin's self-exile on say the USSR page (because why not!), it will pull exactly that thread from that page.
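Here is a toy version of line-level retrieval, to show the shape of the idea. It is a sketch under loud assumptions: the two "pages" are invented, and word counts with cosine similarity stand in for the real embedding math our dev uses, which this post does not describe in detail.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus: (page title, page text).
PAGES = [
    ("Lenin", "Lenin was born in 1870.\nLenin led the Bolsheviks."),
    ("USSR", "The USSR was founded in 1922.\nLenin spent years in exile abroad before 1917."),
]


def vectorize(line: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", line.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Index every line separately, remembering which page it came from.
INDEX = [
    (title, line, vectorize(line))
    for title, text in PAGES
    for line in text.splitlines()
    if line.strip()
]


def top_lines(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k lines closest to the query, whatever page they sit on."""
    q = vectorize(query)
    ranked = sorted(INDEX, key=lambda row: cosine(q, row[2]), reverse=True)
    return [(title, line) for title, line, _ in ranked[:k]]
```

With this toy data, `top_lines("Lenin exile", k=1)` picks the exile line from the USSR page instead of anything on the Lenin page, which is the behavior described above.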

And this is much more powerful than what the WSWS did, and it's why they offer such measly usage rates. They are filling up the context window and sending noise tokens because they're giving an entire ... html page instead of just the relevant content. Again - as far as we can tell from looking in from the outside.

But where does the MCP come in?

MCPs (Model Context Protocol servers) are kinda new, and were made for AI to work with. I wouldn't be the best person to explain them, but basically an MCP lets an LLM look at some data (website, files, etc.) and work with that data in some way. They're mostly used in agentic work: tools such as view file or edit file are exposed to the LLM, so it can perform these operations itself instead of having you do it and then confirm. So if you have an agent (such as crush, our favorite here on lemmygrad), an LLM can and will view and edit the files you tell it to. Those are two examples of tools.

With an MCP, you give the LLM access to data it can read, and you can also give it its own tools. You could make a tool called "ProleWiki-fetch". When the LLM decides to use this tool, it communicates with the ProleWiki MCP you have installed locally, in effect saying "okay, let's use the prolewiki-fetch tool to look at data from prolewiki to answer this question". Then the MCP does its magic and sends the information back to the LLM.
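The tool idea can be sketched in plain Python. To be clear, this does not use the actual MCP SDK or wire format; the registry, the shape of the call dictionary, and the local page data are all assumptions made up for illustration. It only shows the pattern: the model names a tool, the local server runs it and returns text.

```python
from typing import Callable

# Registry of tool name -> function. Hypothetical; not the MCP SDK.
TOOLS: dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under a tool name the LLM can ask for."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


# Hypothetical local data; a real server would query the packaged wiki files.
LOCAL_PAGES = {"Lenin": "Lenin was born in 1870."}


@tool("prolewiki-fetch")
def prolewiki_fetch(title: str) -> str:
    """Look up a page in the local data and return its text."""
    return LOCAL_PAGES.get(title, f"No local page titled {title!r}.")


def handle_tool_call(call: dict) -> str:
    """Run the tool the model asked for and return text for its context."""
    fn = TOOLS.get(call["tool"])
    if fn is None:
        return f"Unknown tool: {call['tool']}"
    return fn(**call.get("arguments", {}))
```

A request such as `handle_tool_call({"tool": "prolewiki-fetch", "arguments": {"title": "Lenin"}})` returns the stored text, which the agent would then place in the LLM's context before it answers.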

And not only that, but as we said you can also run this locally. We are still figuring out how we'll package all of this but most likely we'll make the source files available so that anyone can build any RAG or make their own cloud web interface if they want.

Likewise for the MCP, it will be downloadable with our source files so that you can just add it to your agent interface and start using it to query the LLM and answer with prolewiki content.

Communism is not in a position of strength currently. So, I don't see any reason we should be trying to hide and obfuscate any of our content. On the contrary, proletarian education demands it be accessible without discrimination. Unlike trots, we trust the people to make the right decisions collectively - if someone wants to use ProleWiki content to train a model and paywall that, let them. There will be 10 more that won't be.

In fact speaking of models, our dev is also working on something there... but I was asked not to say too much about it as it's very experimental 🤐

6
 
 

just joking haha... unless 😳

7
 
 
8
 
 

Like the title says. I'd like to be able to read it offline.

9
 
 

Hello everyone,

A quick message, originally in French, to let you know that we've translated ProleWiki into French from the English instance and that we're now looking for editors!

There's still quite a bit of work left to finish integrating the new pages, and I've put together a short guide explaining where you can help us with your contributions: https://fr.prolewiki.org/wiki/Essai:Comment_aider_sur_ProleWiki_(fran%C3%A7ais) (which I will certainly keep filling in and try to simplify)

Don't hesitate to share this as widely as possible; we really want to bring the instance to life and make it self-sustaining. And I hope to see you on ProleWiki!

10
 
 

Obviously this is still in what could be considered "late beta", but the pipeline was a huge success. https://fr.prolewiki.org/

The translation quality is honestly very good, we picked the right model and prompt for this.

This got us I would say 75-80% of the way there, the remaining % points are busywork that you won't escape, or at least I don't know how to automate it...

Think of it this way: ProleWiki EN has 5 years of organic content written over time, with links and page redirects being made along the way. We are starting from 0. So, currently, most pages have redlinks (here's a benchmark one: https://fr.prolewiki.org/wiki/Cor%C3%A9e) because the redirects have not been created. The pages exist, it's just that the links should go to, say, "Kim Il Sung" instead of "Kim Il-Sung". Normally you'd create a redirect like Wikipedia does, i.e. Kim Il-Sung takes you to Kim Il Sung. But we don't have that history so we have to create them.
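For anyone curious what creating those redirects involves: in MediaWiki a redirect is just a page whose body is `#REDIRECT [[Target]]`. The sketch below generates those bodies from a variant-to-target mapping. The mapping and function names are hypothetical, and actually uploading the pages (through the API or a bot) is left out.

```python
# Hypothetical mapping of link variants to the page title that actually exists.
VARIANTS = {
    "Kim Il-Sung": "Kim Il Sung",
}


def redirect_wikitext(target: str) -> str:
    """Body of a MediaWiki redirect page pointing at `target`."""
    return f"#REDIRECT [[{target}]]"


def missing_redirects(existing_titles: set[str]) -> dict[str, str]:
    """Return {variant title: redirect page body} for variants with no page yet."""
    return {
        variant: redirect_wikitext(target)
        for variant, target in VARIANTS.items()
        # Only create a redirect if its target exists and the variant does not.
        if variant not in existing_titles and target in existing_titles
    }
```

The tedious part is building the mapping, since each redlink has to be matched to the right existing page; the generation step itself is trivial.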

We could have exported the redirects but I decided against it because it would probably be a bigger headache. Same for the templates, we're going to run them through Deepseek as needed.

Aside from that we focused on getting the triad of homepages (Home/Library/Essays) cleaned up and ready to go. Here's the essays for example: https://fr.prolewiki.org/wiki/ProleWiki:Essais

I'm hopeful that with this out of the way we will get new editors and even anonymous editors interested in participating (tomorrow I think I will open up anonymous editing on the French instance to every namespace). It'll take some time to finish cleaning everything up and tbh even the english instance isn't completely pristine. I saw some pages that I didn't even know existed and were clearly test pages from 2020 lol.

Obviously fixing these red links is not going to happen overnight, we're in for the long haul. But we got 80% out of the way in a week.

I learned some good practices from this pipeline, things I would do differently. Tbh we were getting kinda antsy to get this up and running. But if we were to redo this for other languages I would do some things a bit differently to save on the headache.

The pipeline was: download all PW pages through API -> Run through LLM to translate from EN to FR -> use regex script to clean up translation artifacts -> upload to website.

Simple enough in theory but not so small in practice, esp. the regex to clean up the translation artifacts.
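A stripped-down sketch of the middle two steps (translate, then clean up with regex) might look like this. The two artifact patterns are assumed examples, since the real artifacts are not listed here, and the translator is passed in as a plain function so the sketch doesn't depend on any particular LLM API. Downloading and uploading through the wiki API are left out.

```python
import re
from typing import Callable

# Assumed examples of artifacts; the real list is not given in this post.
PREAMBLE = re.compile(
    r"\A\s*(?:Voici la traduction|Here is the translation)\s*:\s*",
    re.IGNORECASE,
)
EXTRA_BLANK_LINES = re.compile(r"\n{3,}")


def clean(translated: str) -> str:
    """Strip a chatty preamble and collapse runs of blank lines."""
    text = PREAMBLE.sub("", translated)
    text = EXTRA_BLANK_LINES.sub("\n\n", text)
    return text.strip()


def translate_wiki(
    pages: dict[str, str],
    translate: Callable[[str], str],
) -> dict[str, str]:
    """Translate then clean every page; download and upload are left out."""
    return {title: clean(translate(body)) for title, body in pages.items()}
```

In practice the regex list grows every time a new kind of artifact turns up, which is why this step ended up being the bulk of the work.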

edit - oh yeah, total time from start to finish was exactly 1 week (Saturday to Saturday). This is the power of LLMs lol, you just have to find the right one and prompt it. Funnily enough I think the smaller models did a better job than the bigger models. Contrast that with what 5 sleep-deprived tankies could have achieved lol

11
 
 

Only on the English instance for now bc we need to duplicate code on all instances 😩

Just a small addition but I think it just makes sense and will probably help a lot of people out.

Also interested if someone has ideas on how to advertise this to our readers because I have no idea where to put this info. We also have a reading mode if you press the 0 key on desktop, no idea how to tell people (but I want to put that one in menu instead tbh)

12
13
 
 

Recently, English-language ProleWiki reached 5000 pages.

Personally I am pretty happy about this milestone. Having joined the project a few years ago, watching it grow over time and gain new editors and growth on the other language instances as well has been a good experience. I've learned a lot while working on ProleWiki thanks to everyone's contributions. I want to say thanks to everyone who has contributed in some way, and also to readers of ProleWiki.

To anyone who has thought about submitting an edit to ProleWiki but hasn't tried it yet, try it! Edits without an account go through a review queue. Note that your IP will be shown. If it's not just a typo correction, then the main thing you need to do is make sure you add a source to your edit. You can also apply for an account and/or join the Discord if you want to get more involved. You can also have a look at the list of wanted pages and see if there is a topic on there you're interested in contributing to. Also consider helping out to develop the other language instances. You can also submit essays or become a library editor.

Anyway, I started this post mainly because I wanted to share the 5000 pages mark. Thanks again to anyone who has contributed, and also thank you to readers of ProleWiki. I have been very glad to be able to participate in a project like this for spreading and learning information from an ML perspective. Thanks for reading!

14
 
 

I'm very excited to announce we have started a ProleWiki Telegram broadcast channel which posts news from AES, Global South and the decline of the empire 4 times a day on the dot!

This couldn't have happened without @yogthos@lemmygrad.ml's efforts into making a custom bot for our needs. I could explain the logic behind it, but basically you get handpicked news from PressTV, CGTN, Granma, and more (and counting) trustworthy sources 4 times a day without fail. + you get a link to their story so you know we didn't just pull it out of thin air.

Feel free to join if that seems like a cool thing to you!

15
 
 

The African Liberation Reader is a 1982 compilation of writings and statements from national liberation movements in Africa, primarily from the movements in southern Africa and those who were struggling for national liberation against Portuguese colonialism. The collection was put together by the editors in 1973-4 and first published in Portuguese.

  1. The African Liberation Reader, Volume 1: The Anatomy of Colonialism
  2. The African Liberation Reader, Volume 2: The National Liberation Movements
  3. The African Liberation Reader, Volume 3: The Strategy of Liberation

Currently I am working on proofreading and formatting the text for improved readability, but they are mostly able to be read as they are now.

16
 
 

Here is their PDF (edit: reuploaded on slightly longer delay).

Also, important: we censored the identities of the profiles in this file. The original had no censoring whatsoever on the screenshots.

There are also slurs in the document, written by the wrecker of course.

Idk, there's a lot I could meme on this but I think it speaks for itself. Like I don't think I could make this any funnier than they did themselves.

So we got an editor some time back named AutisticYugoslav and apparently they wanted to get info and doxx our members, that's why they joined. When their job was done they gave us the excuse, I shit you not, that "the website logged them out", faked a meltdown, and used that excuse to leave the server lmao.

This is legit unhinged, you've been warned. It makes no logical sense and there's leaps and jumping to conclusions everywhere.

Two things emerge:

1- This is the only type of person who slanders ProleWiki, without fail. What does that say about our detractors?
2- Yes, half our members are LGBT. If it wasn't clear, don't join if you're anti-LGBT. We'll eat you alive 😎

So thanks for doing some propaganda for us, AutisticYugoslav (who is actually Dutch from his own report), but you kinda failed on the agit part of agitprop. Also nobody gave a shit about your document on your server when you posted it, hope you don't feel bad :( Thankfully the pros are here to show you how it's done.

My favorite part has to be his swastika hammer and sickle that he devotes fucking 10 pages to barking to himself that noooo it's totally not a swastika it's a minecraft flag you just don't get it liberals fascist coping

And of course they never had the balls to say any of this to our face lol. (I say they but his pronouns are he/him last we know).

My only regret is I'm mentioned in this document only two times, there was so much more stuff you could have gotten from me. Here's some for your next doxxing attempt:

I've invited other prolewiki editors to give their own takes because we just could not agree which part was the funniest lol. but yeah we found it very, very funny.

Oh, and it goes without saying, if you want to be part of this "degenerate den of western liberalism" and "dengist hive" because you feel you'll fit right in -->

Discord: https://discord.gg/ZQTBNRU9v5

Request editor account: https://en.prolewiki.org/wiki/Special:RequestAccount

Also fair's fair, here's the guy:

If you share any discord with him let people know ASAP. No hiding spot left for him.

17
1
submitted 5 months ago* (last edited 5 months ago) by TotallynotinIran@lemmygrad.ml to c/prolewiki@lemmygrad.ml
 
 

I attempted to make an account about a day ago and everything went well, but when I tried to do it again afterwards (just to check), the process was different. The form I filled in the first time only wanted a bio with "relevant qualifications" (if I remember correctly) of at least 50 words. The second time it showed a more elaborate form with some questions, which I assume is the actual one I should have filled in. My question is: can I fill in the actual form now with the same email I used the first time, or should I wait for an answer to the email sent to me after the first attempt? Also, is it possible to choose a username that uses variations of the Perso-Arabic script? Is that feature being considered? If yes, would it be possible to change usernames after creating an account?

18
19
 
 

With the surge of new users on lemmygrad there has been growing interest in ProleWiki on the grad. This is some info if you want to start editing, in no particular order:

1 - Lemmygrad and ProleWiki are completely separate projects. While we share some history and the lemmygrad community is sort of our outreach homebase, prolewiki has its own policies, editorship, etc. Just to clear up some misconceptions I see around :) (and so if you want to discuss prolewiki you should join our discord to talk with the editorship directly). We've evolved our own demcent, our own community, policies etc over time.

2 - With that bummer out of the way, you can either make anonymous edits to ProleWiki or request an account. We explain the difference here: https://en.prolewiki.org/wiki/ProleWiki:How_to_make_anonymous_edits

3 - if you make anonymous edits, they go through a moderation queue and the trusted editorship decides to approve or reject the entire edit. For that reason it is preferable to keep them short because if we need to decline an edit for just one part of it, we can't keep the rest of it easily.

4 - We are a marxist-leninist encyclopedia so you don't need to do a Wikipedia-style "both sides" when editing! Bring a marxist perspective with the wording, and put the reader first.

4 - (yes there's two 4s) Please note all claims must have a source attached, you can use the Cite button and then a citation template if you use the visual editor. Edits that do not cite sources will be rejected.

5 - we consider all anon edits individually, regardless of the IP attached. We might also approve an edit but then undo it, usually with a denial reason. You can then go into the page history, open your edit, and start editing from it. It's a bit complicated to explain in a couple of sentences but if someone asks in the comments I can make a quick guide.

6 - we also ask that edits respect our Principles. We also have other guides for your benefit:

a. Encyclopedic tone guide
b. User guide (to use the editor)
c. Editorial guidelines

I would say the encyclopedic tone guide, while it could be improved in the 'fairness' section (don't hide our marxism), sets a good base for a good edit. A lot of people, editors included, remove history when it comes to unfolding or current events, but we have a task of adding and preserving history as well.

7 - you can also request an account which allows you to bypass the moderation queue and participate in the health of the project. Account requests are voted on democratically and take some time, but once you're accepted you don't have to do it again.

8 - anon edits can only edit main content pages, that is those not preceded by a namespace (E.g. Library:, ProleWiki:, Essay:, Category: etc). With an account you can edit almost the entire wiki including adding library books or your own essays.

9 - we have a seldom used library editor account, which only lets you edit the library and nothing else. You still get access to the editors-only chat rooms in the discord with it, allowing you to be embedded in the project. The idea is that we get books, and you get to read them and learn. So if you feel you don't qualify for a full editor account, you can put in your request that you want a library account, it's a bit easier to get.

10 - There's also no quotas or participation merits. You can make an account and then use it sparingly, but of course we prefer that people participate on the discord so we can get to know them and talk to them. But it's not a requirement.

11 - If we deny your account request, you can still make anon edits or try again for a new account at any time! The point is not to gatekeep but to make sure there is ideological consistency and knowledge within the project, since we are an encyclopedia.

12 - if you don't know what to edit honestly just start somewhere and see how it goes! Add sources, fix typos, add paragraphs, whatever you notice. Most of my edits are made either on topics I've researched a lot, or pages I come across and think "this could be improved".

As an editor you can also participate in more than just editing. For example we're interested in a library maintainer: someone that can provide some vision for the library space, oversee edits made on it, etc. Experience is not necessary but some vision and ideas would be nice. You can also propose your own ideas in the project for stuff you'd like to start.

20
 
 

I don't know if there's a post asking about this already, but what happened to the whole Common Software concept?

I found this webpage for the RTC:

https://comrades.sbs/

which I think is what the article on FOSS is referring to.

But I see that the pages for Common Software and Revolutionary Technical Committee have been deleted.

Does the License still exist? What happened to the RTC? And can I find more information anywhere?

Because now it just feels like a tease, but with no link to learn more. And I've wanted to find a better alternative to FOSS for my software.

Thank you so much for the wiki. I love it and would love to keep reading it.

21
 
 

Many of the works I've seen on the ProleWiki library do not have a link to an ebook version. Even the source pages often do not.
I could find ebooks of some works in other places on the internet, but often the translations are different or they're missing the footnotes found in the ProleWiki versions.

I tried to convert a couple of them to epubs by downloading the "printable version" provided by ProleWiki and running it through Calibre, but the result is always broken.

Does anyone have suggestions for people who prefer to use ereaders?

22
 
 

What do I mean by this? Well, this is how The German Ideology used to look on ProleWiki:

https://en.prolewiki.org/index.php?title=Library%3AThe_German_ideology&oldid=85037

And this is what it looks like now: https://en.prolewiki.org/wiki/Library:The_German_ideology

In the 'before', the book was imported entirely on a single page. This is how all books on prolewiki are.

In the after, the book is on a single page, but you can also use the integrated table of contents to navigate directly inside a chapter subpage. E.g.: https://en.prolewiki.org/wiki/Library:The_German_ideology/The_essence_of_the_materialist_conception_of_history_Social_being_and_social_consciousness

This gives readers choice as to how they want to read books.

We have to run a script on every library page individually, so it'll take some time and new books will probably lag behind before they're split, but this is a net positive overall.
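For the curious, the core of such a split can be sketched in a few lines of Python. This is not the actual script: it assumes each chapter starts at a level-2 wikitext heading (`== Title ==`) and that subpages are named `Book/Chapter`, and it ignores details like sanitizing titles or handling text before the first heading.

```python
import re

# Level-2 headings only: "== Title ==", but not "=== Subsection ===".
HEADING = re.compile(r"^==(?!=)\s*(.+?)\s*(?<!=)==\s*$", re.MULTILINE)


def split_chapters(book_title: str, wikitext: str) -> dict[str, str]:
    """Map 'Book/Chapter' subpage titles to each chapter's text."""
    matches = list(HEADING.finditer(wikitext))
    subpages = {}
    for i, m in enumerate(matches):
        # A chapter runs from the end of its heading to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        subpages[f"{book_title}/{m.group(1)}"] = wikitext[m.end():end].strip()
    return subpages
```

Each key would become a chapter subpage, while the original single-page version stays available for people who prefer to scroll through the whole book.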

Also reminder to press '0' on your keyboard (desktop only) when using prolewiki to enable reading mode, and check the gear icon in the sidebar for customization/accessibility options.

23
 
 

The first and only online copy of this book(s) was made available by a comrade who digitized all three volumes. We gratefully offered to rehost it for them.

I haven't read it myself but they've shared some excerpts and it sounds like a very good read from a based (20th century) author, which is a rarity from terf island.

This is the revised edition.

24
 
 

I have finally gotten around to implementing the much-awaited reading mode. This is a distraction-free reading mode that basically removes everything but the text.

Combined with the settings (the gear icon which not enough people use tbh), you can now tailor your reading experience to pretty much the level you need. Font-size, font type (serif, sans), theme (light-dark-sepia), and even page width.

For now it's kind of an easter egg because you have to know to press the 0 key, but it allows us to test the new mode and start deploying it.

Press 0 again to cancel it, or just refresh the page - it doesn't save on purpose. Reading mode only works on desktop because it kind of already exists on mobile.

It also adds a line cursor that highlights the line you're pointing over, to help you keep track of where you are on the page, or if you need it for accessibility reasons.

The goal for the reading mode is "a reading experience so comfortable you will start reading one word and suddenly hours have passed". I don't know if we'll achieve that but we'll get as close to it as possible.

It still needs some fiddling with the colors (and so does the sepia theme) so you don't have to tell me about that lol, but if you have ideas or stuff you've always wanted to have in a reading mode or similar, lmk.

25
1
submitted 9 months ago* (last edited 9 months ago) by davel@lemmygrad.ml to c/prolewiki@lemmygrad.ml