Day 030 — bufio & Buffered I/O

Month 2 · Week 1

🎯 Learning Objective

Use bufio.Scanner to read input token-by-token and bufio.Writer to batch many small writes, and know the buffer-size traps that bite beginners.

📚 Topics

  • bufio.Scanner: Scan/Text/Bytes/Err, split functions
  • bufio.Writer: Flush, batching syscalls
  • bufio.Reader: ReadString/ReadLine for long lines
  • The 64 KiB token limit and ErrTooLong

📖 Reading / Sources

📝 Notes

  • Why buffer? Unbuffered reads/writes often map to one syscall each. Reading a file a byte at a time that way is brutally slow; bufio moves data in large chunks and hands you the pieces → [[buffered-io]].
  • bufio.NewScanner(r) wraps any io.Reader. The default split is bufio.ScanLines, which yields each line with the trailing \n/\r\n stripped → [[bufio-scanner]].
  • The scan loop idiom: for sc.Scan() { use sc.Text() }, then always check if err := sc.Err(); err != nil { ... } after the loop. Scan() returns false at both EOF and error; Err() is how you tell them apart (it returns nil for a clean EOF).
  • Text() allocates a new string for each token; Bytes() returns a slice into the Scanner's internal buffer that is valid only until the next Scan, so copy it if you need to keep it.
  • Swap behavior with sc.Split(...): ScanWords, ScanRunes, ScanBytes, or a custom bufio.SplitFunc. Set the split before the first Scan; Split panics if scanning has already started.
  • Token-too-long trap: by default a single token is capped at bufio.MaxScanTokenSize (64 KiB). A longer line makes Scan() return false and Err() return bufio.ErrTooLong. Raise the cap with sc.Buffer(buf, max) before the first Scan, or use bufio.NewReader(r).ReadString('\n') for arbitrarily long lines → [[scanner-token-limit]].
  • bufio.Writer collects writes and flushes in bulk. Flush() is mandatory: whatever is still in the buffer is lost if the program exits without it. defer w.Flush() covers early returns, but a bare defer discards the error, so wrap it in a closure that checks the result. Deferred calls do not run on os.Exit or log.Fatal.
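
A minimal sketch of the Flush pattern from the last note. The helpers writeLines and render are hypothetical names, and an in-memory bytes.Buffer stands in for a real file. The deferred closure is there only so the Flush error reaches the caller through the named return value:

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"os"
)

// writeLines (hypothetical helper) batches n small writes through a
// bufio.Writer and reports a Flush failure via the named result err.
func writeLines(dst io.Writer, n int) (err error) {
	w := bufio.NewWriter(dst)
	// A bare `defer w.Flush()` would throw the error away, so wrap it.
	defer func() {
		if ferr := w.Flush(); err == nil {
			err = ferr // keep the first error; only fill in if none so far
		}
	}()
	for i := 1; i <= n; i++ {
		if _, werr := fmt.Fprintf(w, "line %d\n", i); werr != nil {
			return werr
		}
	}
	return nil
}

// render (hypothetical helper) runs writeLines against an in-memory buffer.
func render(n int) (string, error) {
	var out bytes.Buffer
	err := writeLines(&out, n)
	return out.String(), err
}

func main() {
	s, err := render(3)
	if err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
		os.Exit(1) // safe here: writeLines has already returned and flushed
	}
	fmt.Print(s)
}
```

Without the Flush, nothing would reach the bytes.Buffer here, because three short lines fit well inside the Writer's default buffer.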

💻 Code Examples

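// Snippet; assumes imports "bufio", "fmt", and "strings".
sc := bufio.NewScanner(strings.NewReader("a b c"))
sc.Split(bufio.ScanWords) // set BEFORE the first Scan
words := 0
for sc.Scan() { // one iteration per whitespace-separated word
    words++
}
if err := sc.Err(); err != nil {
    fmt.Println("scan error:", err) // real error, not EOF
}
fmt.Println(words) // 3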

Full code: examples/month-02/bufio-scan/main.go · Run: go run ./examples/month-02/bufio-scan
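
A second sketch, this one for the token-limit trap from the notes. The helper names (longInput, lineLengths, readerLineLengths) and the 1 MiB cap are illustrative assumptions and are not taken from the example repo:

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// longInput builds one 100 KiB line followed by a short one.
func longInput() string {
	return strings.Repeat("x", 100*1024) + "\nshort\n"
}

// lineLengths scans line by line. maxToken <= 0 keeps the Scanner defaults.
func lineLengths(input string, maxToken int) ([]int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	if maxToken > 0 {
		// Must happen before the first Scan. The buffer starts at 64 KiB
		// and may grow up to maxToken bytes.
		sc.Buffer(make([]byte, 0, 64*1024), maxToken)
	}
	var lens []int
	for sc.Scan() {
		lens = append(lens, len(sc.Bytes()))
	}
	return lens, sc.Err()
}

// readerLineLengths does the same with bufio.Reader.ReadString,
// which grows its result as needed and has no token cap.
func readerLineLengths(input string) ([]int, error) {
	r := bufio.NewReader(strings.NewReader(input))
	var lens []int
	for {
		line, err := r.ReadString('\n')
		if len(line) > 0 {
			lens = append(lens, len(strings.TrimRight(line, "\r\n")))
		}
		if err == io.EOF {
			return lens, nil
		}
		if err != nil {
			return lens, err
		}
	}
}

func main() {
	in := longInput()

	_, err := lineLengths(in, 0)
	fmt.Println(errors.Is(err, bufio.ErrTooLong)) // true

	lens, err := lineLengths(in, 1024*1024)
	fmt.Println(lens, err) // [102400 5] <nil>

	lens, err = readerLineLengths(in)
	fmt.Println(lens, err) // [102400 5] <nil>
}
```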

🏋️ Exercises / Practice

Exercise | Status | Link
Line/word/byte counter (uses bufio.Scanner) |  | exercises/month-02/week-1/linestats

🐛 Mistakes Made

  • Forgot w.Flush() on a bufio.Writer → output file was empty even though every Fprintf "succeeded".
  • Kept a slice from sc.Bytes() past the next Scan() → it got overwritten. Switched to sc.Text() (or copied).
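
A small sketch of the fix for the second mistake, assuming a hypothetical helper collectLines. Each token is copied before the next Scan. Whether the bug is visible without the copy depends on when the Scanner shifts or refills its buffer, so short inputs can appear to work by accident:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// collectLines (hypothetical helper) keeps every line as its own []byte.
func collectLines(input string) ([][]byte, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	var lines [][]byte
	for sc.Scan() {
		// sc.Bytes() points into the Scanner's internal buffer, so take
		// a private copy before the next Scan can reuse that memory.
		line := append([]byte(nil), sc.Bytes()...)
		lines = append(lines, line)
	}
	return lines, sc.Err()
}

func main() {
	lines, err := collectLines("alpha\nbeta\ngamma\n")
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	for _, l := range lines {
		fmt.Printf("%s\n", l)
	}
}
```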

❓ Open Questions

  • When is bufio.NewReaderSize worth tuning over the default 4 KiB buffer?
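
One way to start probing this question is to count how often the bufio.Reader has to go back to its source. This is a rough experiment, not an answer: countingReader and underlyingReads are hypothetical names, and with an in-memory source the call count is only a proxy for the syscalls a real file or socket would incur.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// countingReader (hypothetical) counts calls to the wrapped Read.
type countingReader struct {
	r     io.Reader
	calls int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.calls++
	return c.r.Read(p)
}

// underlyingReads drains total bytes one ReadByte at a time through a
// bufio.Reader of size bufSize and returns how many times the source
// itself was asked for data.
func underlyingReads(total, bufSize int) (int, error) {
	cr := &countingReader{r: strings.NewReader(strings.Repeat("x", total))}
	br := bufio.NewReaderSize(cr, bufSize)
	for {
		if _, err := br.ReadByte(); err == io.EOF {
			return cr.calls, nil
		} else if err != nil {
			return cr.calls, err
		}
	}
}

func main() {
	for _, size := range []int{4096, 64 * 1024} {
		n, err := underlyingReads(1<<20, size)
		fmt.Printf("buffer %6d B: %d underlying Read calls (err=%v)\n", size, n, err)
	}
}
```

A larger buffer cuts the number of trips to the source, at the cost of memory per reader. Whether that matters in practice would need a benchmark against the real source.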

🧠 Active Recall (answer without looking)

  1. Q: After a for sc.Scan() loop ends, how do you tell EOF from a read error?

    A: Call `sc.Err()`. It returns `nil` on a clean EOF and the underlying error otherwise; `Scan()` returns `false` in both cases.

  2. Q: Your bufio.Writer output is missing. Likely cause?

    A: You never called `Flush()`. Buffered bytes stay in memory until flushed; exiting first loses them. `defer w.Flush()` and check its error.

🪶 Feynman Reflection

Talking to the OS one byte at a time is like carrying groceries one item per trip. bufio is the shopping bag: it grabs a big chunk at once (reading) or fills up before making a trip (writing). The only catch on the writing side is you must remember to actually make the trip — that's Flush.

🕳️ Knowledge Gaps

  • Writing a robust custom SplitFunc that handles partial tokens at buffer boundaries.
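
A first attempt at closing this gap, modeled on the shape of bufio.ScanLines. The names scanCommas and splitCommas are hypothetical, and iotest.OneByteReader is used only to force every token to straddle several reads. The key move is returning (0, nil, nil) when the data holds no complete token yet, which asks the Scanner to read more and call again:

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
	"testing/iotest"
)

// scanCommas (hypothetical SplitFunc) yields comma-separated fields.
func scanCommas(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if atEOF && len(data) == 0 {
		return 0, nil, nil // nothing left to tokenize
	}
	if i := bytes.IndexByte(data, ','); i >= 0 {
		return i + 1, data[:i], nil // complete field; consume the comma too
	}
	if atEOF {
		return len(data), data, nil // last field has no trailing comma
	}
	return 0, nil, nil // partial token: request more data
}

// splitCommas (hypothetical helper) feeds input one byte per Read.
func splitCommas(input string) ([]string, error) {
	src := iotest.OneByteReader(strings.NewReader(input))
	sc := bufio.NewScanner(src)
	sc.Split(scanCommas)
	var fields []string
	for sc.Scan() {
		fields = append(fields, sc.Text())
	}
	return fields, sc.Err()
}

func main() {
	fields, err := splitCommas("a,bb,,ccc")
	fmt.Printf("%q %v\n", fields, err) // ["a" "bb" "" "ccc"] <nil>
}
```

One limitation of this sketch: input that ends in a comma produces no final empty field, because the first branch stops as soon as the remaining data is empty at EOF.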

✅ Summary

I can scan input by line/word/rune/custom token, distinguish EOF from errors via Err(), raise the token limit when needed, and always Flush buffered writers.

⏭️ Next Steps / Prep for Tomorrow

  • Day 031: os, files, and process exit codes.

Time spent | Difficulty | Confidence
90 min | 🟦🟦⬜⬜⬜ | 🟦🟦🟦⬜⬜

Suggested commit: feat(examples): bufio Scanner and buffered writer (day 030)