[bitcoindev] [BIP Proposal] Compressed Base58 Encoding for BIP-39 Mnemonics with Multisig Extensions


From: Zach @ 2025-07-14 16:12 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


Dear Bitcoin-Dev Community,

My name is Zach and I'm a Bitcoin enthusiast interested in improving wallet 
usability and security. I'm writing to propose a new Bitcoin Improvement 
Proposal (BIP) and seek initial feedback before submitting a formal draft 
via GitHub pull request.

The proposal builds on BIP-39 by introducing a compressed representation of 
mnemonic phrases using Base58 encoding, reducing a 12-word mnemonic to 24 
characters. It also includes an optional extension for multisig setups 
(e.g., 2-of-3), where each participant's storage combines their compressed 
mnemonic with foreign xpubs and a checksum into a ~252-character Base58 
string. This aims to facilitate secure, distributed storage in 
space-constrained scenarios like physical backups or QR codes.

Below is a draft of the BIP in plain text format for review. I've followed 
the structure from BIP-2 but haven't assigned a number yet. If the 
community finds this idea promising, I'll convert it to MediaWiki and 
submit it to the bitcoin/bips repo.
------------------------------
*Bitcoin Improvement Proposal: BIP-XXXX (Compressed Base58 Mnemonic 
Encoding with Multisig Extensions)* 

*Preamble*

   - BIP#: XXXX (to be assigned) 
   - Title: Compressed Base58 Encoding for BIP-39 Mnemonics with Multisig 
   Storage Format 
   - Author: Zach corneliusamadeusacts10@gmail•com 
   - Status: Draft
   - Type: Standards Track
   - Created: 2025-07-14
   - License: BSD-2-Clause

*Abstract*

This BIP proposes a method to compress BIP-39 mnemonic phrases by mapping 
each word in the standard 2048-word English wordlist to a unique 
two-character string using Base58 encoding. This reduces a typical 12-word 
mnemonic (approximately 59 characters including spaces) to a compact 
24-character string without spaces. The encoding is deterministic, 
reversible, and leverages the existing BIP-39 word indices, ensuring 
compatibility with current seed generation practices.

Additionally, this BIP introduces an optional extension for multi-signature 
(multisig) setups, particularly m-of-n schemes like 2-of-3. In such 
scenarios, each participant's location stores their local compressed 
mnemonic alongside the extended public keys (xpubs) from the other 
participants. These components are concatenated into a single Base58 string 
with a checksum for integrity, resulting in a ~252-character string per 
location. This facilitates secure, distributed storage while enabling 
independent address derivation at each site.

This can aid in easier storage, transmission, and input of mnemonics and 
multisig data in space-constrained environments, such as physical etchings 
or QR codes, while maintaining security.

*Motivation*

BIP-39 mnemonics are human-readable but verbose. Compressing them to 24 
characters for 12 words reduces overhead significantly. In multisig 
contexts, such as a 2-of-3 scheme with geographically distributed 
participants, each location must store foreign xpubs to derive shared 
addresses, but private mnemonics remain isolated. Concatenating the 
compressed mnemonic with the foreign xpubs and a checksum creates a compact, 
verifiable dataset (~252 characters) that supports error detection during 
transcription or recovery.

This does not replace BIP-39 or existing multisig standards (e.g., BIP-45 
for derivation paths) but provides an optional compressed representation 
and storage format.

*Specification*

Base58 Alphabet

123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz

Word Index Mapping and Encoding/Decoding 

The BIP-39 English wordlist is sorted alphabetically, assigning indices 
from 0 ("abandon") to 2047 ("zoo"). The full wordlist is defined in BIP-39 
and remains unchanged.

For a word with index i (0 ≤ i ≤ 2047):

   1. Compute high = i // 58 (integer division, range 0-35 for i ≤ 2047) 
   2. Compute low = i % 58 (range 0-57) 
   3. The two-character code is alphabet[high] + alphabet[low] (big-endian 
   order, higher digit first) 

Examples:

   - i=0 ("abandon"): "11" 
   - i=2047 ("zoo"): high = 35, low = 17, giving "cJ" 
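
The per-word mapping above can be sketched in a few lines of Python. This is an illustrative sketch only, not the reference implementation mentioned later in the draft; the function names encode_index and decode_code are assumptions made for this example. Following the arithmetic in the steps, index 2047 splits into high = 35 and low = 17.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def encode_index(i: int) -> str:
    """Map a BIP-39 word index (0..2047) to its two-character code."""
    if not 0 <= i <= 2047:
        raise ValueError(f"word index out of range: {i}")
    high, low = divmod(i, 58)  # high digit first (big-endian)
    return ALPHABET[high] + ALPHABET[low]


def decode_code(code: str) -> int:
    """Map a two-character code back to a word index."""
    if len(code) != 2:
        raise ValueError("code must be exactly two characters")
    # str.index raises ValueError for characters outside the alphabet.
    i = ALPHABET.index(code[0]) * 58 + ALPHABET.index(code[1])
    if i > 2047:
        # 58^2 = 3364 codes exist, but only the first 2048 are assigned.
        raise ValueError(f"code {code!r} does not map to a BIP-39 word")
    return i


print(encode_index(0))     # 11
print(encode_index(2047))  # cJ
```

Rejecting codes above index 2047 in the decoder is a design choice of this sketch; the draft does not say how unassigned codes should be handled.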

Compressing a mnemonic: Look up each word's index, encode each index to 
2 chars, and concatenate (24 chars for 12 words).

Decompressing: Split into 2-char chunks, decode each chunk to an index, 
reject any index above 2047, look up the words, and verify the BIP-39 
checksum.
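
As a rough sketch of the compress/decompress round trip, the Python below works directly on word indices so that it stays self-contained (the 2048-word list is omitted; looking up words is a plain list lookup). The function names are hypothetical, and the checksum helper covers only the 12-word case, where BIP-39 appends the first 4 bits of SHA-256(entropy) to 128 bits of entropy.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def compress(indices):
    """Encode a list of BIP-39 word indices as two characters each."""
    for i in indices:
        if not 0 <= i <= 2047:
            raise ValueError(f"word index out of range: {i}")
    return "".join(ALPHABET[i // 58] + ALPHABET[i % 58] for i in indices)


def decompress(text):
    """Split into 2-char chunks and decode each chunk to a word index."""
    if len(text) % 2:
        raise ValueError("compressed mnemonic must have even length")
    indices = []
    for k in range(0, len(text), 2):
        i = ALPHABET.index(text[k]) * 58 + ALPHABET.index(text[k + 1])
        if i > 2047:
            raise ValueError(f"chunk {text[k:k + 2]!r} is not a valid word code")
        indices.append(i)
    return indices


def bip39_checksum_ok(indices):
    """Verify the BIP-39 checksum of a 12-word mnemonic given as indices."""
    if len(indices) != 12:
        raise ValueError("this sketch only handles 12-word mnemonics")
    bits = 0
    for i in indices:
        bits = (bits << 11) | i          # 12 words x 11 bits = 132 bits
    entropy = (bits >> 4).to_bytes(16, "big")
    checksum = bits & 0xF                # last 4 bits
    return hashlib.sha256(entropy).digest()[0] >> 4 == checksum


# "abandon" x 11 + "about" is the well-known all-zero-entropy mnemonic.
indices = [0] * 11 + [3]
packed = compress(indices)
print(packed, len(packed))
print(bip39_checksum_ok(decompress(packed)))
```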

Multisig Storage Format (Extension) 

Assumes 12-word mnemonics, 111-char xpubs.

Structure: Compressed mnemonic (24 chars) + Sorted foreign xpub1 (111) + 
xpub2 (111) + Checksum (6 chars) = 252 chars.

Checksum: Double-SHA256 of decoded data bytes, first 4 bytes encoded to 
Base58, padded to 6 chars with '1's.

Verification: Extract parts, recompute checksum, validate.

Usage in 2-of-3: Each location stores local compressed + 2 foreign xpubs + 
checksum.
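
The storage format can be sketched as follows. Several details are assumptions of this example rather than statements from the draft: the checksum is computed over the ASCII bytes of the 246-character data string (the draft's "decoded data bytes" could be read differently), the padding '1's are added on the left, the xpubs are sorted as plain strings, and the names build_storage and parse_storage are invented here.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
MNEMONIC_LEN, XPUB_LEN, CHECKSUM_LEN = 24, 111, 6
TOTAL_LEN = MNEMONIC_LEN + 2 * XPUB_LEN + CHECKSUM_LEN  # 252


def checksum(data: str) -> str:
    """First 4 bytes of double-SHA256, Base58-encoded, left-padded to 6 chars."""
    # Assumption: hash the ASCII bytes of the data string.
    digest = hashlib.sha256(hashlib.sha256(data.encode("ascii")).digest()).digest()
    n = int.from_bytes(digest[:4], "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = ALPHABET[r] + out
    return out.rjust(CHECKSUM_LEN, "1")  # 2^32 < 58^6, so 6 chars always suffice


def build_storage(compressed: str, foreign_xpubs) -> str:
    """Concatenate compressed mnemonic, sorted foreign xpubs, and checksum."""
    if len(compressed) != MNEMONIC_LEN:
        raise ValueError("compressed mnemonic must be 24 characters")
    if len(foreign_xpubs) != 2 or any(len(x) != XPUB_LEN for x in foreign_xpubs):
        raise ValueError("expected two 111-character xpubs")
    data = compressed + "".join(sorted(foreign_xpubs))
    return data + checksum(data)


def parse_storage(storage: str):
    """Split a 252-char string into its parts after verifying the checksum."""
    if len(storage) != TOTAL_LEN:
        raise ValueError("storage string must be 252 characters")
    data, check = storage[:-CHECKSUM_LEN], storage[-CHECKSUM_LEN:]
    if checksum(data) != check:
        raise ValueError("checksum mismatch")
    first_end = MNEMONIC_LEN + XPUB_LEN
    return data[:MNEMONIC_LEN], data[MNEMONIC_LEN:first_end], data[first_end:]
```

In a 2-of-3 setup each location would call build_storage with its own compressed mnemonic and the two xpubs belonging to the other participants; parse_storage then recovers the three parts or fails on a transcription error.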

*Rationale* 

   - 58^2 = 3364 > 2048 for unique codes. 
   - Concatenation leverages Base58 for compactness. 
   - Fixed lengths for easy parsing.
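
The capacity arguments behind these points can be checked with a few lines of arithmetic. The helper name base58_chars_needed is made up for this sketch; it returns the smallest number of Base58 characters that can represent a given count of distinct values.

```python
def base58_chars_needed(n_values: int) -> int:
    """Smallest k such that 58**k >= n_values."""
    k = 0
    while 58 ** k < n_values:
        k += 1
    return k


# Two characters cover the 2048-word list, with codes to spare.
print(base58_chars_needed(2048))   # 2
print(58 ** 2 - 2048)              # 1316 unassigned two-character codes

# A 4-byte checksum has 2**32 possible values, which needs 6 characters.
print(base58_chars_needed(2 ** 32))  # 6
```

One consequence is that 1316 of the 3364 two-character strings never appear in a valid compressed mnemonic, so a decoder can treat them as transcription errors.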

*Backwards Compatibility* 

Fully compatible; optional extension.

*Reference Implementation*

Full Python reference implementation available upon request or in the 
formal submission.

*Test Vectors*

   - Compressed mnemonic: "1112131415161718191A1B1C" (12 two-character 
   codes for word indices 0 through 11; 24 chars). 
   - xpub1: 
   "xpub661MyMwAqRbcFtXgS5sYJABqqG9YLmC4Q1Rdap9gSE8NqtwybGhePY2gZ29ESFjqJoCu1Rupje8YtGqsefD265TMg7usUDFdp6W1EGMcet8" 
   (111 chars). 
   - xpub2: 
   "xpub68Gmy5EdvgibQVbouHdFqKg4ZP5oVjJeJUP7yKS4xEM2it2c8EL3MEN8WC4PTEiGK8oFUaKUV4ndBKSp53Q6fFi8DwhsBS1Lxf4H9fwnK2" 
   (111 chars). 
   - Sorted: xpub1, xpub2 (assuming lex order). 
   - Data str: "1112131415161718191A1B1C" + xpub1 + xpub2 (24 + 111 + 111 = 
   246 chars). 
   - Checksum: Compute as described in the Specification (6 chars; "2AbCdE" 
   is a placeholder here, not a computed value). 
   - Full storage: Data str + checksum (252 chars).

*Acknowledgements* 

   - BIP-39 authors for the foundational wordlist. 
   - Bitcoin community for Base58 standards.
   - Multisig distribution insights.

------------------------------

I'd appreciate any feedback on this draft: Is the concept sound? Are there 
similar existing proposals I'm unaware of? Suggestions for improvements, 
such as in the checksum method or generalization to other word counts/m-n 
schemes?

If there's interest, I'll proceed with the GitHub submission after 
incorporating comments.

Thank you very much for your attention to this matter and any insights!

Best regards, Zach corneliusamadeusacts10@gmail•com

