
A rune in Go is an alias for int32 and represents a Unicode code point. Runes are essential for properly handling international text and multi-byte characters in Go.

💡 Key Points

  • rune is an alias for int32
  • Runes represent Unicode code points
  • Use single quotes for rune literals: 'A'
  • Iterating over a string with range yields runes (each paired with its starting byte index), not bytes
  • Convert strings to []rune for character-based operations
  • One rune can occupy 1-4 bytes in UTF-8 encoding
  • Use the unicode/utf8 package to count, decode, and validate runes
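
The points above can be checked with a short program. This is a minimal sketch; the helper name encodedSizes is made up for the illustration, and it assumes the input is valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedSizes returns the UTF-8 byte width of each rune in s.
// Hypothetical helper for this illustration; assumes s is valid UTF-8.
func encodedSizes(s string) []int {
	sizes := make([]int, 0, utf8.RuneCountInString(s))
	for _, r := range s {
		sizes = append(sizes, utf8.RuneLen(r))
	}
	return sizes
}

func main() {
	var r rune = 'A'
	var i int32 = r // no conversion needed: rune is an alias for int32
	fmt.Println(i)  // 65

	// range yields the starting byte index and the rune found there,
	// so the indexes printed here are 0, 1, 3, and 6.
	for idx, c := range "aß世😀" {
		fmt.Printf("index %d: %c (%d bytes)\n", idx, c, utf8.RuneLen(c))
	}

	fmt.Println(encodedSizes("aß世😀")) // [1 2 3 4]
}
```

The index jumps by the encoded width of the previous rune, which is why it should not be used as a character position.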

Runes vs Bytes

Understanding the difference between bytes and runes is crucial for text processing:

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    s := "Hello 世界"
    
    // Length in bytes
    fmt.Println("Byte length:", len(s))
    
    // Length in runes (characters)
    fmt.Println("Rune count:", utf8.RuneCountInString(s))
    
    // Iterating bytes (wrong for multi-byte chars)
    fmt.Println("Bytes:")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%c ", s[i])
    }
    fmt.Println()
    
    // Iterating runes (correct)
    fmt.Println("Runes:")
    for _, r := range s {
        fmt.Printf("%c ", r)
    }
    fmt.Println()
}
Output:
Byte length: 12
Rune count: 8
Bytes:
H e l l o   ä ¸ � ç � �
Runes:
H e l l o   世 界
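
The garbled byte output above comes from %c treating each byte value as its own code point. A small sketch that prints the raw UTF-8 bytes in hex makes this visible; hexBytes is a hypothetical helper name used only here.

```go
package main

import "fmt"

// hexBytes renders the raw UTF-8 bytes of s as space-separated hex.
// Hypothetical helper for this illustration.
func hexBytes(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	fmt.Println(hexBytes("世")) // e4 b8 96
	fmt.Println(hexBytes("界")) // e7 95 8c

	// %c on a single byte prints the code point with that numeric value,
	// so the first byte of 世 (0xe4) is shown as U+00E4.
	b := "世"[0]
	fmt.Printf("%c %U\n", b, b) // ä U+00E4
}
```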

Working with Runes

Convert between strings and runes for proper character handling:

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    // Single rune literal
    r := 'A'
    fmt.Printf("Rune: %c, Value: %d\n", r, r)
    
    // Unicode escape
    heart := '\u2665'
    fmt.Printf("Heart: %c\n", heart)
    
    // Convert string to rune slice
    s := "Hello 世界"
    runes := []rune(s)
    fmt.Println("Rune slice:", runes)
    fmt.Println("Length:", len(runes))
    
    // Access individual characters
    fmt.Printf("First char: %c\n", runes[0])
    fmt.Printf("Last char: %c\n", runes[len(runes)-1])
    
    // Modify runes (strings are immutable)
    runes[0] = 'h'
    modified := string(runes)
    fmt.Println("Modified:", modified)
    
    // Decode first rune from string
    r, size := utf8.DecodeRuneInString(s)
    fmt.Printf("First rune: %c (size: %d bytes)\n", r, size)
}
Output:
Rune: A, Value: 65
Heart: ♥
Rune slice: [72 101 108 108 111 32 19990 30028]
Length: 8
First char: H
Last char: 界
Modified: hello 世界
First rune: H (size: 1 bytes)

Rune Operations

| Operation | Syntax | Description |
| --- | --- | --- |
| Rune literal | 'A' | Single character in quotes |
| Unicode escape | '\u0041' | Unicode code point (4 hex digits) |
| Long unicode | '\U00000041' | Unicode code point (8 hex digits) |
| String to runes | []rune(s) | Convert string to rune slice |
| Runes to string | string(runes) | Convert rune slice to string |
| Count runes | utf8.RuneCountInString(s) | Number of runes in string |
| Decode rune | utf8.DecodeRuneInString(s) | Get first rune and its size |
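
As a sketch of how the decode operation composes, the loop below walks a string one rune at a time with utf8.DecodeRuneInString, which is roughly what a range loop does for you. The function name decodeAll is an assumption made for the example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeAll collects the runes of s by decoding the first rune,
// then advancing by the byte size that the decoder reports.
// Hypothetical helper for this illustration.
func decodeAll(s string) []rune {
	var out []rune
	for len(s) > 0 {
		r, size := utf8.DecodeRuneInString(s)
		out = append(out, r)
		s = s[size:]
	}
	return out
}

func main() {
	// All three literal forms denote the same code point.
	fmt.Println('A' == '\u0041', 'A' == '\U00000041') // true true

	s := "Go 世界"
	runes := decodeAll(s)
	fmt.Println(string(runes) == s)                      // true
	fmt.Println(len(runes) == utf8.RuneCountInString(s)) // true
}
```

For most code, []rune(s) or a range loop is simpler; the manual loop is mainly useful when the byte size of each rune matters.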

Unicode Package Functions

| Function | Description | Example |
| --- | --- | --- |
| unicode.IsLetter(r) | Check if rune is a letter | unicode.IsLetter('A') |
| unicode.IsDigit(r) | Check if rune is a digit | unicode.IsDigit('5') |
| unicode.IsSpace(r) | Check if rune is whitespace | unicode.IsSpace(' ') |
| unicode.IsUpper(r) | Check if rune is uppercase | unicode.IsUpper('A') |
| unicode.IsLower(r) | Check if rune is lowercase | unicode.IsLower('a') |
| unicode.ToUpper(r) | Convert rune to uppercase | unicode.ToUpper('a') |
| unicode.ToLower(r) | Convert rune to lowercase | unicode.ToLower('A') |
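
The functions in this table are usually applied inside a range loop. The sketch below shows one possible way to combine them; countClasses and swapCase are hypothetical names, and strings.Map from the standard library is used to apply a per-rune conversion.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// countClasses tallies the letters, digits, and whitespace runes in s.
// Hypothetical helper for this illustration.
func countClasses(s string) (letters, digits, spaces int) {
	for _, r := range s {
		switch {
		case unicode.IsLetter(r):
			letters++
		case unicode.IsDigit(r):
			digits++
		case unicode.IsSpace(r):
			spaces++
		}
	}
	return letters, digits, spaces
}

// swapCase flips uppercase runes to lowercase and vice versa,
// leaving all other runes unchanged. Hypothetical helper.
func swapCase(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case unicode.IsUpper(r):
			return unicode.ToLower(r)
		case unicode.IsLower(r):
			return unicode.ToUpper(r)
		}
		return r
	}, s)
}

func main() {
	fmt.Println(countClasses("Go 1.22 世界")) // 4 3 2
	fmt.Println(swapCase("Hello Gopher"))   // hELLO gOPHER
}
```

Note that unicode.IsLetter reports true for 世 and 界, so the letter count covers non-Latin scripts as well.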

Bytes vs Runes

| Aspect | Byte | Rune |
| --- | --- | --- |
| Type | byte (alias for uint8) | rune (alias for int32) |
| Size | 1 byte (8 bits) | 4 bytes (32 bits) in memory |
| Range | 0-255 | Unicode code points (0-1,114,111) |
| Literal | No dedicated literal; convert a constant, e.g. byte('A') | Single quotes: 'A' |
| Use case | Raw binary data, ASCII | Unicode text, characters |
| String access | s[0] returns a byte | range loop yields runes |
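
To make the last row concrete, here is a small sketch that compares the first byte of a string with its first rune and shows how slicing at a byte offset can cut a character in half. The helpers firstUnits and validPrefix are hypothetical names, and both assume the string is non-empty and the offset is in range.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstUnits returns the first byte and the first rune of s.
// Hypothetical helper; assumes s is non-empty.
func firstUnits(s string) (byte, rune) {
	r, _ := utf8.DecodeRuneInString(s)
	return s[0], r
}

// validPrefix reports whether the first n bytes of s form valid UTF-8.
// Hypothetical helper; assumes 0 <= n <= len(s).
func validPrefix(s string, n int) bool {
	return utf8.ValidString(s[:n])
}

func main() {
	b, r := firstUnits("世界")
	fmt.Printf("%T %#x\n", b, b)       // uint8 0xe4
	fmt.Printf("%T %#x %c\n", r, r, r) // int32 0x4e16 世

	s := "Hello 世界"
	fmt.Println(validPrefix(s, 7)) // false: the cut lands inside 世
	fmt.Println(validPrefix(s, 9)) // true: the cut lands after 世
}
```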

Common Patterns

| Pattern | Code | Use Case |
| --- | --- | --- |
| Iterate characters | for _, r := range s { } | Process each Unicode character |
| Character count | utf8.RuneCountInString(s) | Get actual character count |
| Substring by chars | string([]rune(s)[2:5]) | Slice by characters, not bytes |
| Validate rune | unicode.IsLetter(r) | Check character properties |
| Case conversion | unicode.ToUpper(r) | Convert individual characters |
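
These patterns are often wrapped in small helpers. The following is one possible sketch, not a canonical implementation: runeSubstring and capitalize are hypothetical names, and the bounds clamping is a design choice made here so that out-of-range indexes do not panic.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// runeSubstring returns the runes of s in the half-open range [start, end),
// clamping both indexes to the rune length of s. Hypothetical helper.
func runeSubstring(s string, start, end int) string {
	runes := []rune(s)
	if start < 0 {
		start = 0
	}
	if end > len(runes) {
		end = len(runes)
	}
	if start >= end {
		return ""
	}
	return string(runes[start:end])
}

// capitalize upper-cases the first rune of s and keeps the rest as is.
// Empty strings and strings that start with invalid UTF-8 are returned
// unchanged. Hypothetical helper.
func capitalize(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError && size <= 1 {
		return s
	}
	return string(unicode.ToUpper(r)) + s[size:]
}

func main() {
	fmt.Println(runeSubstring("Hello 世界", 4, 8)) // o 世界
	fmt.Println(capitalize("привет"))            // Привет
}
```

Converting to []rune copies the whole string, so for long inputs a range loop that stops early may be preferable.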