The Global Brain - the roadmap #7064

synctext · 2022-09-26T12:00:31Z

1996 fantasies: "The World-Wide Web as a Super-Brain: from metaphor to model" in Cybernetics and Systems, plus "Algorithms for the self-organization of distributed, multi-user networks. Possible application to the future world wide web"
1999: first code: "Open Information Pools" at the USENIX conference, by the Tribler team

The global brain is a neuroscience-inspired and futurological vision of the planetary
information and communications technology network that interconnects all humans
and their technological artifacts. As this network stores ever more information, takes
over ever more functions of coordination and communication from traditional organizations,
and becomes increasingly intelligent, it increasingly plays the role of a brain for the planet Earth.

https://en.wikipedia.org/wiki/Global_brain

Growth strategy (e.g. megalomania)

Offer value beyond the offerings of the core Internet protocols or Big Tech.

Offer perfect Google-level search for any BitTorrent swarm through content tagging, swarm size sampling, de-duplication, and relevance ranking.
Include the flow of money so that superfans can directly reward artists. Nullify cancel culture [ref].
Develop DAO technology based on Taproot and FROST.
Disrupt academic publishing. Enable direct publishing by academics and calculate global citations with an alternative to peer review (see our BrainDAO prototype).
Crowdsourcing of machine learning extensions: self-evolving AI-DAO.
Decentralised marketplaces with fairness and reduced fraud through collaborative trust.
Grow further. Replace the global financial system with a trustworthy and fair alternative. 😲 😲 😲
Currency substitution: re-invent money. By design, we replace the current one-dimensional financial system with a new form of money based on fairness, transparency, and the ending of tax havens for the elite.

First running code (2009)

By Leonardo

Prior roadmap (2021)

Prior roadmap with a markdown focus and channels. We learned that channels require permissions; tags are a superior, permissionless method to achieve content discoverability. Key steps:

Sprint	Milestone	Description
	Wikipedia import	Convert the MediaWiki content of Wikipedia into markdown. Create a single static Wikipedia markdown dump in an example Tribler channel. See offline Wikipedia for inspiration.
2021-2026	Global Brain	Incrementally grow the content collection with static example channels. Let the community use our features while we showcase what is technically possible. Grow with more content and a broader ecosystem, from importing all known Creative Commons content to all open-access medical and scientific publishing.

Tribler: The Global Brain in Web3

Many parties try to control the Internet. We have now worked for 17 years and 5 months on technology which provides ordinary Internet users with full self-sovereign control. The goal is to design a sustainable, self-governing, self-funded, and long-enduring ecosystem. Web3 is the emerging decentralised movement for which we aim to design the architecture, first realisation, and pioneering community. Ongoing prototyping can be found here:

Our efforts are deeply political. We aim to boost the power of citizens and reduce the power of all others. Wikipedia wisdom: "a more distributed form of decision-making would decrease the power of governments, corporations or political leaders, thus increasing democratic participation and reducing the dangers of totalitarian control" [ref].

Scientific and technological challenges

Scientific challenge	Description
Content search	create a decentralised Google
Misinformation, spam and pollution	create a web-of-trust which works. Zero-trust architecture.
Incentive alignment	People contribute instead of free-riding
Decentralisation and Scalability	Works for 2 people and 2 billion people. Zero-server technology, no cloud, no JavaScript, and no parasitic browser technology in general. Based on decentralised thinking, see MIT piece.
permissionless and unstoppable
privacy and safety	Tor-based seeding protocol
self-governance	owned by both everybody and nobody
self-funding	Use our Taproot-based DAO with FROST for threshold signatures. Democratic voting on proposals, our running code. Awards based on bounties (retroactive funding mechanism)
Collective Intelligence	self-evolution. 2017 'autonomous self-replication code' with actual implementation.
composable building blocks	Realised within the Superapp by loading code from BitTorrent

Related work: the global brain dream is old and attracts hippies and fraudsters alike.
Related work by Cornell University. Abandoned in 2008, in the "peak P2P" era. Their Cubit search engine is fully operational, but looks very 2008.

Their knowledge graph is designed for an Internet filled with gentle, non-fraudulent strangers. No measures against spam, misinformation, poisoning, or fraud are implemented (see their nice documentation):

We are realising a true Cyberpunk world. Goal: Cyberpunk 2052. Our radical experiments with online democracy have replaced all analog governments.


synctext · 2022-11-14T13:23:04Z

Passive Personalization: learn through consumption

It's all about the effortless and frictionless experience. Interesting viewpoints: the two-sided market and the removal of explicitly following people in TikTok. Our global brain in the early stages should yield an explicit list of #HashTags the user is interested in. These hashtags define both the social graph and the user's interests. The sorted list of #HashTags could be the user experience paradigm to make your personalised search profile explicit. This gives the user trust in Tribler and trust in the machine learning capabilities. It also enables the user to correct mistakes.
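A minimal sketch of how such an explicit hashtag profile could be derived from passive consumption. Everything here is an illustrative assumption, not Tribler code: the `hashtag_profile` function, the `tags` field on consumed items, and the `blocked` set that models the user correcting a mistake are all hypothetical names.

```python
from collections import Counter


def hashtag_profile(consumed_items, blocked=frozenset(), top_n=10):
    """Return (hashtag, count) pairs sorted by how often the user consumed them.

    consumed_items: iterable of dicts with a "tags" list (hypothetical schema).
    blocked: lower-case hashtags the user removed from their profile by hand.
    """
    counts = Counter(
        tag.lower()
        for item in consumed_items
        for tag in item["tags"]
        if tag.lower() not in blocked
    )
    # most_common() sorts by count, highest first
    return counts.most_common(top_n)


# Hypothetical consumption history: nothing is followed explicitly,
# the profile is learned from what was actually watched or downloaded.
history = [
    {"title": "a", "tags": ["#Linux", "#OpenSource"]},
    {"title": "b", "tags": ["#linux", "#BitTorrent"]},
    {"title": "c", "tags": ["#linux"]},
]

profile = hashtag_profile(history)
# The user corrects a mistake by striking one hashtag from the list.
corrected = hashtag_profile(history, blocked={"#bittorrent"})
```

Because the profile is just a sorted list, it can be shown to the user directly, which is the point of making the personalisation explicit.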

A single graph: combine the trust graph with the relevance graph of interesting content.

synctext · 2022-12-15T10:23:02Z

Dempster-Shafer Classifier

Current federated machine learning frameworks lack the general reasoning, attack-resilience, and intelligence for the global brain. Epistemic AI might be the theoretical grounding we need for reasoning about trust, taste, truth, knowledge, and contested concepts.

"Although it is a major step in NLP research, GPT-3.5 does not fully contain all the ideal properties envisaged by many NLP researchers (including AI2). The important property that GPT-3.5 does not have is formal reasoning."

"Dempster-Shafer Theory [DST][GS76][GS90] is a mathematical theory of evidence, offers an alternative to traditional probabilistic theory for the mathematical representation of uncertainty. The significant innovation of this framework is that it allows for the allocation of a probability mass to sets or intervals as opposed to mutually exclusive singletons. In contrast, Bayesian inference requires some a priori knowledge and is unable to assign a probability to ignorance." COPIED

"In this paper, we introduce a concept called epistemic deep learning based on the random-set interpretation of belief functions to model epistemic learning in deep neural networks. We propose a novel random-set convolutional neural network for classification that produces scores for sets of classes by learning set-valued ground truth representations. We evaluate different formulations of entropy and distance measures for belief functions as viable loss functions for these random-set networks."

TU Delft is part of the Epistemic AI EU project to study this work: https://starlab.ewi.tudelft.nl/
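To make the quoted description of Dempster-Shafer theory concrete, here is a minimal sketch of Dempster's rule of combination, which fuses two bodies of evidence whose probability mass is assigned to sets of hypotheses rather than to single outcomes. The two-hypothesis frame (`trusted`, `spam`) and the example masses are assumptions chosen for illustration; this is not the random-set network from the quoted paper.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: fuse two mass functions.

    A mass function maps frozensets of hypotheses to masses that sum to 1.
    Mass on the whole frame expresses ignorance.
    """
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            fused[common] = fused.get(common, 0.0) + wa * wb
        else:
            # The two sources contradict each other on this pair of sets
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass
    return {s: w / (1.0 - conflict) for s, w in fused.items()}


def belief(m, hypothesis):
    """Total mass of all sets fully contained in the hypothesis."""
    return sum(w for s, w in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Total mass of all sets that do not contradict the hypothesis."""
    return sum(w for s, w in m.items() if s & hypothesis)


TRUSTED = frozenset({"trusted"})
SPAM = frozenset({"spam"})
FRAME = TRUSTED | SPAM  # mass placed here means "I do not know"

peer_a = {TRUSTED: 0.6, FRAME: 0.4}  # some evidence for trusted, rest ignorance
peer_b = {SPAM: 0.3, FRAME: 0.7}     # weak evidence for spam, mostly ignorance

fused = combine(peer_a, peer_b)
```

The gap between `belief(fused, TRUSTED)` and `plausibility(fused, TRUSTED)` is the remaining ignorance, which a single Bayesian probability cannot represent.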

synctext · 2023-05-22T07:09:48Z

Local-first AI is emerging

The key underlying technology for the global brain. We are still far ahead as scientists, but Reddit is catching up. This will be big and mainstream soon/sometime/possibly.
"Having a 20 gig file that you can ask an offline computer almost any question in the world is amazing.
That's all. I just don't have anyone in my life who appreciates this concept beyond being happy for me when I explain it."
"I'm with you, absolutely amazing but nobody I know is interested. Their loss, i have no clue why everyone isn't talking about this all the time."
HN: But fully decentralised learning is hard. Try to apply machine learning in a permissionless, byzantine, unsupervised, decentralised, adversarial, continuous learning context. See my lab at Delft University, focused for a decade already on "The Global Brain"[2]. With 2.3 million downloads and crowd-sourcing, we might get there.
Commercial bots with local .PDF scanning and photo tag understanding: Mitta is an artificially intelligent document bot.

synctext · 2024-09-12T15:47:42Z

Towards a global brain for humanity

Human knowledge is expanding at a rapid rate. Inspired by initiatives such as Linux, Wikipedia, and Bitcoin, we present our Internet-deployed global brain prototype. Our work is based on 24 years of prototyping cardinal components such as distributed reputation, a trust framework, secure content discovery, resource mining, and ...

GPU mining and donating to the common good
donating data
curation and trust in data
improving the algorithms of intelligence

A+ publication venue: Joint Conference on Digital Libraries (JCDL)

Ongoing related work to build upon:

MASSW is a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows. MASSW includes more than 152,000 peer-reviewed publications from 17 leading computer science conferences spanning the past 50 years.
{Microsoft} ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
- we further randomly sample a subset of 300 papers as core papers (to obtain a reasonably sized benchmark dataset), which means we subsequently generate and evaluate 300 research ideas for each model.
- Published LLM system prompt: You are an AI assistant whose primary goal is to propose innovative, rigorous, and valid methodologies to solve newly identified scientific problems derived from existing scientific literature, in order to empower researchers to pioneer groundbreaking solutions that catalyze breakthroughs in their fields. You are going to propose a scientific method to address a specific research problem. Your method should be clear, innovative, rigorous, valid, and generalizable. This will be based on a deep understanding of the research problem, its rationale, existing studies, and various entities.
LitLLM: A Toolkit for Scientific Literature Review
LLAssist AI-powered web app automating literature reviews. Leverages NLP and LLMs to extract key information from research articles, evaluate relevance to user queries, and streamline academic research workflows.
Scideator: Human-LLM Scientific Idea Generation Grounded in Research-Paper Facet Recombination
Has-Anyone-Done-This? LLM: "Has anyone used decentralized federated learning for learn to rank?"
Numerous (17) papers about scientific hypothesis generation with large language models (LLMs).
PaperQA2 is a package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our recent 2024 paper to see examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. (has local LLM ability)
latest: PaperQA2-version of Wikipedia covering the human proteome. The 240 articles that were graded by experts as better than existing Wikipedia are already viewable - we're generating the rest over the next few weeks!
LLM-based writing tools
- https://x.writefull.com/paraphraser
- https://www.explainpaper.com/
Lacking feature:
- CS Ranking and prestige (e.g. an AI puts you on frontpage for a whole week)
- AI alternative to the cover page of the Nature or Science journal. "The TU Delft physicists have been publishing virtually non-stop in what are regarded as the best science journals in their field. Landing a cover story however remains an honour that merits some extra effort." https://repository.tudelft.nl/file/File_42cc0292-b933-48c4-9f06-e05019359b07
System Architecture
- Embarrassingly Parallel Training of Expert Language Models
- Mixture of A Million Experts
- Decentralized fusion of experts over networks
- Apple: Federated Unbiased Learning to Rank
- TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices
- ❤️ 😲 ❤️ Learning from Heterogeneous Data Based on Social Interactions over Graphs
  - Adaptive Social Learning
  - Adaptation in Online Social Learning
  - Interplay between Topology and Social Learning over weak graphs
  - All this theory is unproven, just a dream so far. Naive world assumptions: no attacker, no malware, no global surveillance economy. The harder challenge is to realise this in real systems, with real decentralisation and real spammers.

Global Brain: Minimal Viable Product

Search through all human knowledge
- Well, Creative Commons only, due to current legal constraints
- Like Core.ac.uk and scholar.google and re-usage of https://github.com/ArchiveBox/ArchiveBox (21k stars)
- Permissionless knowledge
- Search, re-mix, and publish
IPv8 overlay of GPU boxes and seedboxes
Seedboxes of Creative Commons full-text .PDF files
GPU boxes with local-hosted LLMs
Each node supports full text search of .PDF files
RAG package for paper parsing
Remote queries of paper collections (public knowledge infrastructure)
Permissionless Knowledge
- build upon existing work
- create bundles of 100-500 papers on any given topic
- LLM RAG "understands" them
- Torrent2Knowledge
- improve the typical PhD student workflow around paper writing
- LLM exports the knowledge of the paper collection of a typical PhD student
- improved state-of-the-art learning, faster scientific workflow, superhuman publication speeds, and superior societal innovation speed 🤯
New GUI (expanded version of Tribler?!?)
- Search for Creative Commons papers
- Discovery of node content (ToDo: details)
- Explore paper clusters
- Remote LLM RAG queries
- Exact matches and Phonetic Matching
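The per-node full-text search with exact and phonetic matching listed above could look roughly like the sketch below. It assumes the text has already been extracted from the .PDF files, uses American Soundex as one possible phonetic scheme, and keeps everything in memory; `PaperIndex`, `soundex`, and the paper identifiers are hypothetical names, not part of Tribler or IPv8.

```python
import re
from collections import defaultdict

# Soundex digit for each coded consonant; vowels, h, w, and y carry no code
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """American Soundex: first letter plus three digits, e.g. Robert -> R163."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "hw":  # h and w do not separate equal codes, vowels do
            prev = digit
    return (code + "000")[:4]


class PaperIndex:
    """In-memory inverted index over already-extracted paper text."""

    def __init__(self):
        self.exact = defaultdict(set)     # token -> paper ids
        self.phonetic = defaultdict(set)  # soundex code -> paper ids

    def add(self, paper_id, text):
        for token in re.findall(r"[a-z]+", text.lower()):
            self.exact[token].add(paper_id)
            self.phonetic[soundex(token)].add(paper_id)

    def search(self, query, phonetic=False):
        """Return ids of papers that match every query token."""
        tokens = re.findall(r"[a-z]+", query.lower())
        if not tokens:
            return set()
        if phonetic:
            postings = [self.phonetic.get(soundex(t), set()) for t in tokens]
        else:
            postings = [self.exact.get(t, set()) for t in tokens]
        return set.intersection(*postings)


index = PaperIndex()
index.add("paper-1", "Decentralised learning to rank over gossip overlays")
index.add("paper-2", "Federated learning on edge devices")

exact_hits = index.search("learning rank")
typo_hits = index.search("lerning", phonetic=True)
```

A remote query over the overlay would then only need to ship the query string and return the matching identifiers; ranking, trust, and spam resistance are deliberately left out of this sketch.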

Scientific discovery automation, including knowledge synthesis, benchmarking, and closed-loop discovery.

synctext added the "type: long-term", "type: memo (Stuff that can't be solved)", and "type: project idea" labels Sep 26, 2022

synctext added this to the Backlog milestone Sep 26, 2022

synctext self-assigned this Sep 26, 2022

devos50 mentioned this issue Oct 7, 2022

Redesign of the Search/Channels feature #3615

Closed

synctext mentioned this issue Feb 10, 2023

phd placeholder: "Decentralized Machine Learning Systems for Information Retrieval" #7290

Open

qstokkink removed the estimation: more than month label Aug 19, 2024

qstokkink removed this from the Backlog milestone Aug 23, 2024

qstokkink removed the type: Epic label Aug 26, 2024

qstokkink removed type: project idea type: long-term labels Sep 3, 2024

synctext mentioned this issue Sep 12, 2024

Phd Placeholder: learn-to-rank, decentralised AI, on-device AI, something. #7586

Open
