MySQL Interview Questions and Answers (2026) Interview Questions

BACKEND INTERVIEW PREPARATION

MySQL Interview Questions and Answers

100 MySQL interview questions covering InnoDB internals, indexing, transactions, locking, replication & MySQL 8 features — curated for Java backend developer roles in 2026.

⏳ 60 min read 📝 100 Q&As 🎯 Beginner to Advanced

⚡ Quick Reference

InnoDB vs MyISAM	InnoDB: transactions, row locks, FK, crash recovery; MyISAM: table locks, no transactions
Default isolation level	REPEATABLE READ (InnoDB)
Index structure	B+Tree — sorted keys, all row data/PKs in leaf nodes
EXPLAIN	Shows execution plan: access type, chosen key, estimated rows, Extra
binlog	Binary log of data changes — drives replication & point-in-time recovery
VARCHAR vs TEXT	VARCHAR: length-limited, can have DEFAULT, fully indexable; TEXT: large, prefix index only
AUTO_INCREMENT	Auto-generated sequence; counter persisted across restarts since MySQL 8
MySQL 8 highlights	CTEs, window functions, invisible indexes, JSON enhancements, roles, atomic DDL

MySQL Architecture

Client (JDBC / mysql CLI)

↓

Connection Layer (auth, thread handling)

↓

Parser / Optimizer (SQL layer)

↓

Storage Engine (InnoDB)

↓

Disk (tablespaces, redo log, binlog)

MySQL Basics & Architecture

Q1. What is MySQL and why is it so widely used with Java backends?

A: MySQL is an open-source relational database management system that uses SQL and follows the client-server model. It is popular with Java backends because of its maturity, strong JDBC driver support (MySQL Connector/J), first-class integration with Spring/Hibernate, replication features for scaling reads, and a large operational ecosystem. Since version 5.5, its default storage engine is InnoDB, which provides full ACID transactions.

Q2. Describe MySQL's high-level architecture.

A: MySQL has a layered architecture: the connection layer handles client connections, authentication, and thread management; the SQL layer contains the parser, preprocessor, optimizer, and executor; and the storage engine layer (pluggable — InnoDB, MyISAM, MEMORY) handles actual data storage and retrieval through a common API. This separation lets you choose different engines per table without changing SQL.

Q3. What is the difference between the SQL layer and the storage engine layer?

A: The SQL layer is engine-agnostic: it parses queries, rewrites them, chooses an execution plan, and manages functions, views, and stored procedures. The storage engine layer implements how rows are physically stored, indexed, locked, and recovered. Features like transactions, row-level locking, and MVCC live in the engine (InnoDB), not in the SQL layer — which is why MyISAM tables cannot participate in transactions.

Q4. What happened to the query cache in MySQL 8?

A: The query cache was removed entirely in MySQL 8.0. It suffered from coarse invalidation (any write to a table invalidated all cached results for it) and a global mutex that became a bottleneck on multi-core systems. The recommended replacements are application-level caching (Redis, Caffeine) or ProxySQL query caching.

Q5. How does MySQL handle client connections?

A: MySQL uses a thread-per-connection model: each client connection is served by a dedicated OS thread, with a thread cache to reuse threads and avoid creation overhead. The max_connections variable caps concurrent connections; exceeding it produces "Too many connections" errors. Java applications should always use a connection pool (HikariCP) rather than opening a connection per request.

SHOW VARIABLES LIKE 'max_connections';   -- default 151
SHOW STATUS LIKE 'Threads_connected';    -- current connections
SET GLOBAL max_connections = 500;        -- runtime change

Q6. How does MySQL differ from PostgreSQL?

A: MySQL uses pluggable storage engines and a clustered primary key layout (InnoDB), while PostgreSQL has a single heap-based storage model with all indexes secondary. PostgreSQL offers richer SQL features (full CHECK constraint history, transactional DDL, more advanced types), while MySQL has traditionally been favored for simple read-heavy workloads and easy replication. MySQL's default isolation is REPEATABLE READ; PostgreSQL's is READ COMMITTED.

Q7. How do you connect to MySQL from the CLI and from Java?

A: MySQL listens on port 3306 by default. From the shell you use the mysql client; from Java you use Connector/J with a JDBC URL. Always set the timezone and character set explicitly in production JDBC URLs to avoid conversion surprises.

mysql -h db.example.com -P 3306 -u appuser -p appdb

// Java (Connector/J 8.x)
String url = "jdbc:mysql://db.example.com:3306/appdb"
           + "?useSSL=true&serverTimezone=UTC&characterEncoding=utf8";
Connection conn = DriverManager.getConnection(url, "appuser", "secret");

Q8. What is information_schema?

A: information_schema is a virtual database of read-only views exposing metadata: tables, columns, indexes, constraints, partitions, and privileges. In MySQL 8 it is backed by the transactional data dictionary, making it much faster than in 5.7. It is the standard way to introspect schema programmatically, e.g., from migration tools like Flyway or Liquibase.

SELECT table_name, engine, table_rows
FROM information_schema.tables
WHERE table_schema = 'appdb';

Q9. What is the difference between DDL, DML, DCL, and TCL?

A: DDL (Data Definition Language) defines schema: CREATE, ALTER, DROP, TRUNCATE. DML (Data Manipulation Language) changes data: INSERT, UPDATE, DELETE, and SELECT for retrieval. DCL (Data Control Language) manages privileges: GRANT, REVOKE. TCL (Transaction Control Language) manages transactions: START TRANSACTION, COMMIT, ROLLBACK, SAVEPOINT. In MySQL 8, DDL statements are atomic — they either fully succeed or fully roll back.

Q10. What is sql_mode and why does STRICT_TRANS_TABLES matter?

A: sql_mode controls how strictly MySQL validates data. With STRICT_TRANS_TABLES (default since 5.7), invalid or out-of-range values cause errors instead of silent truncation or zero-substitution. Disabling strict mode is a classic source of data corruption bugs — a 300-character string silently truncated into a VARCHAR(255), for example.

SELECT @@sql_mode;
-- ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,
-- NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION

Q11. What are prepared statements and why should Java code always use them?

A: A prepared statement is parsed and planned once, then executed repeatedly with different bound parameters. Benefits: prevention of SQL injection (parameters are never interpreted as SQL), reduced parse overhead, and driver-side type handling. In JDBC, always use PreparedStatement with placeholders instead of string concatenation.

String sql = "SELECT id, email FROM users WHERE status = ? AND created_at > ?";
try (PreparedStatement ps = conn.prepareStatement(sql)) {
    ps.setString(1, "ACTIVE");
    ps.setTimestamp(2, Timestamp.valueOf(since));
    try (ResultSet rs = ps.executeQuery()) { /* ... */ }
}

Q12. What does the MySQL query optimizer do?

A: The optimizer is cost-based: it evaluates candidate plans (index choices, join orders, access methods) using statistics such as index cardinality and estimated row counts, then picks the cheapest. It also performs rewrites like constant folding, subquery-to-join transformation, and ORDER BY elimination. You can inspect its decision with EXPLAIN and influence it with optimizer hints like /*+ INDEX(t idx_name) */.

Storage Engines: InnoDB vs MyISAM

Q13. What are the key differences between InnoDB and MyISAM?

A: InnoDB supports ACID transactions, row-level locking, foreign keys, MVCC, and crash recovery via redo logs. MyISAM supports none of these: it uses table-level locking, has no transactions, and can corrupt on crash. MyISAM only wins on niche cases like read-only compressed tables. All modern OLTP workloads should use InnoDB.

SHOW ENGINES;   -- InnoDB shows SUPPORT = DEFAULT
SELECT table_name, engine FROM information_schema.tables
WHERE table_schema = 'appdb' AND engine <> 'InnoDB';

Q14. Why has InnoDB been the default engine since MySQL 5.5?

A: Because it is the only bundled engine offering transactional safety and concurrency at scale: row-level locking allows many concurrent writers, MVCC gives non-blocking reads, redo/undo logging gives crash recovery, and foreign keys give referential integrity. MyISAM's table locks serialize writes, which collapses throughput under concurrent load.

Q15. What is a clustered index in InnoDB?

A: In InnoDB, the table is the clustered index: rows are physically stored in the leaf pages of a B+Tree ordered by primary key. A primary key lookup therefore reads the row directly with no extra hop. This also means insert order matters — monotonically increasing keys (AUTO_INCREMENT) append to the right-most page, while random keys (e.g., raw UUIDv4) cause page splits and fragmentation.

Q16. What happens if an InnoDB table has no primary key?

A: InnoDB first looks for the first UNIQUE NOT NULL index and clusters on that. If none exists, it generates a hidden 6-byte row_id as the clustered key. This hidden counter is shared globally across all such tables, which can become a contention point, and row-based replication becomes far slower because replicas must full-scan to find rows. Always define an explicit primary key.

Q17. What is the InnoDB buffer pool?

A: The buffer pool is InnoDB's main memory cache holding data pages and index pages. Reads hit memory when pages are cached; writes are made to pages in the pool (dirty pages) and flushed to disk asynchronously. It uses a modified LRU with young/old sublists to resist full-table-scan pollution. On a dedicated DB server it is typically sized to 60–75% of RAM.

-- my.cnf
innodb_buffer_pool_size = 12G
innodb_buffer_pool_instances = 8

-- hit ratio check
SHOW STATUS LIKE 'Innodb_buffer_pool_read%';

Q18. What is the redo log?

A: The redo log is a write-ahead log (WAL) recording physical page changes before they are flushed to the tablespace. On commit, only the small sequential redo record must be durable — the actual data pages can be flushed later. After a crash, InnoDB replays the redo log to recover committed changes that never reached the data files. It guarantees the D (durability) in ACID.

Q19. What is the undo log?

A: Undo logs store the previous versions of modified rows. They serve two purposes: rolling back uncommitted transactions, and providing old row versions to MVCC so consistent reads can see a snapshot without blocking writers. Long-running transactions prevent undo purge, which bloats undo tablespaces and slows queries — a classic production incident.

Q20. What is the doublewrite buffer?

A: InnoDB pages are 16KB, but OS/disk writes may tear at 4KB boundaries during a crash, leaving a half-written page that redo cannot repair (redo needs an intact page image). The doublewrite buffer solves this by writing pages first to a sequential doublewrite area, then to their final location; on recovery a torn page is restored from the doublewrite copy.

Q21. What is the change buffer?

A: The change buffer caches modifications to secondary index pages that are not currently in the buffer pool, avoiding random reads just to apply small index updates. The buffered changes are merged when the page is later read or by background threads. It only applies to non-unique secondary indexes (uniqueness would require reading the page anyway to check duplicates).

Q22. How do you convert a MyISAM table to InnoDB safely?

A: Use ALTER TABLE ... ENGINE=InnoDB, which rebuilds the table. On large tables run it with an online schema change tool (gh-ost, pt-online-schema-change) to avoid blocking. Verify there are no FULLTEXT features relying on MyISAM behavior (InnoDB has its own FULLTEXT since 5.6) and add a proper primary key first if missing.

ALTER TABLE legacy_orders ENGINE=InnoDB;
-- or online, without blocking writes:
pt-online-schema-change --alter "ENGINE=InnoDB" D=appdb,t=legacy_orders --execute

Data Types

Q23. What is the difference between CHAR and VARCHAR?

A: CHAR(n) is fixed-length: values are space-padded to n characters and storage is constant. VARCHAR(n) is variable-length: it stores the actual characters plus a 1–2 byte length prefix. Use CHAR for truly fixed-width values (country codes, MD5 hex hashes) and VARCHAR for everything else. Note n is characters, not bytes — with utf8mb4 a VARCHAR(255) can take up to 1020 bytes.

CREATE TABLE t (
  country_code CHAR(2),        -- always 2 chars
  email VARCHAR(320) NOT NULL  -- variable length
);

Q24. When should you use VARCHAR vs TEXT?

A: Use VARCHAR when the value has a sensible maximum length, needs a DEFAULT value, or must be fully indexed — VARCHAR participates in the 65,535-byte row limit and is typically stored inline. TEXT types (TEXT, MEDIUMTEXT, LONGTEXT) hold large content, can be stored off-page in overflow pages, cannot have non-expression DEFAULTs, and can only be indexed with a prefix length. For bounded fields like names or emails, VARCHAR is almost always correct.

Q25. What does INT(11) mean, and what is the difference between INT and BIGINT?

A: The number in INT(11) is only a display width hint for ZEROFILL — it does not affect range or storage, and it is deprecated in MySQL 8. INT is 4 bytes (±2.1 billion signed); BIGINT is 8 bytes (±9.2 quintillion). Use BIGINT for surrogate keys on high-volume tables — running out of INT AUTO_INCREMENT range is a real production outage scenario.

Q26. Why should you use DECIMAL instead of FLOAT/DOUBLE for money?

A: FLOAT and DOUBLE are binary IEEE 754 types that cannot exactly represent most decimal fractions, so monetary sums accumulate rounding errors. DECIMAL(p,s) stores exact decimal values and performs exact arithmetic. Pair it with java.math.BigDecimal on the Java side — JDBC maps DECIMAL to BigDecimal natively.

CREATE TABLE payments (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  amount DECIMAL(19,4) NOT NULL   -- exact; maps to BigDecimal in Java
);

Q27. What is the difference between DATETIME and TIMESTAMP?

A: TIMESTAMP is stored as UTC and converted to/from the session time_zone on read/write; its range is 1970–2038 (the 2038 limit is being extended in newer versions, but plan carefully). DATETIME is stored literally with no timezone conversion and ranges from year 1000 to 9999. A common pattern: store business timestamps as DATETIME in UTC enforced by the application, and configure connectionTimeZone=UTC in Connector/J.

Q28. How does AUTO_INCREMENT work, and can it have gaps?

A: Each table has a counter that assigns the next value on insert. Gaps are normal: rolled-back transactions, failed inserts with INSERT ... ON DUPLICATE KEY, and bulk-insert pre-allocation all consume values without producing rows. Never rely on AUTO_INCREMENT being contiguous. Since MySQL 8 the counter is persisted in the redo log, so it no longer resets to MAX(id)+1 after a restart.

INSERT INTO orders (customer_id) VALUES (42);
SELECT LAST_INSERT_ID();   -- per-connection, safe under concurrency

-- JDBC equivalent
PreparedStatement ps = conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS);

Q29. What are the pros and cons of the ENUM type?

A: ENUM stores one value from a fixed list as a compact 1–2 byte integer with human-readable output, and it validates input. Cons: adding a value requires ALTER TABLE, ordering/comparison semantics are based on the internal index and surprise people, and portability to other databases is poor. Many teams prefer a VARCHAR with a CHECK constraint (enforced since MySQL 8.0.16) or a lookup table, mapped to a Java enum via JPA @Enumerated(EnumType.STRING).

Q30. How should you store UUIDs in MySQL efficiently?

A: Store them as BINARY(16), not CHAR(36) — half the size in every index. Random UUIDv4 primary keys also scatter inserts across the clustered index, causing page splits; MySQL 8's UUID_TO_BIN(uuid, 1) swaps the time components of a v1 UUID to make values roughly sequential. Alternatively keep an AUTO_INCREMENT BIGINT primary key and put the UUID in a unique secondary column.

CREATE TABLE users (
  id BINARY(16) PRIMARY KEY,
  email VARCHAR(320) NOT NULL UNIQUE
);
INSERT INTO users (id, email) VALUES (UUID_TO_BIN(UUID(), 1), 'a@b.com');
SELECT BIN_TO_UUID(id, 1), email FROM users;

Q31. What is the difference between utf8 and utf8mb4?

A: MySQL's legacy utf8 (alias for utf8mb3) stores at most 3 bytes per character, so it cannot store supplementary characters such as emoji — inserts fail or truncate. utf8mb4 is real UTF-8 with up to 4 bytes per character and is the default charset in MySQL 8. Always use utf8mb4 for new tables and match it in the JDBC connection.

CREATE TABLE messages (
  body TEXT
) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

ALTER TABLE old_table CONVERT TO CHARACTER SET utf8mb4;

Q32. What is the native JSON data type?

A: The JSON type (5.7+) validates documents on insert and stores them in an optimized binary format allowing fast path access without reparsing. You query it with functions like JSON_EXTRACT (or the -> / ->> operators) and update in place with JSON_SET. It is ideal for sparse, schema-flexible attributes attached to otherwise relational rows.

CREATE TABLE products (
  id BIGINT PRIMARY KEY,
  attrs JSON
);
SELECT id, attrs->>'$.color' AS color
FROM products
WHERE JSON_EXTRACT(attrs, '$.stock') > 0;

Indexing & B+Trees

Q33. Why does MySQL use B+Trees for indexes instead of hash tables or binary trees?

A: A B+Tree is a shallow, wide, balanced tree whose nodes match disk pages (16KB), so a lookup in a billion-row table takes only 3–4 page reads. Unlike hash indexes, B+Trees support range scans, prefix matches, and ORDER BY, because leaf pages are sorted and linked in a doubly-linked list. Binary trees are too deep (too many disk seeks) and hash tables only support exact equality.

Q34. How do secondary indexes work in InnoDB?

A: A secondary index leaf stores the indexed column values plus the primary key of the row — not a physical row pointer. A secondary-index lookup therefore does two B+Tree traversals: one on the secondary index to get the PK, then one on the clustered index to fetch the row (a "back to the table" lookup). This is also why a fat primary key inflates every secondary index.

Q35. What is a covering index?

A: A covering index contains every column a query needs, so InnoDB answers the query entirely from the secondary index without touching the clustered index. EXPLAIN shows Using index in the Extra column. It is one of the highest-leverage optimizations for hot read queries.

CREATE INDEX idx_status_created_email
  ON users (status, created_at, email);

-- fully covered: no clustered-index lookup
SELECT email FROM users
WHERE status = 'ACTIVE' AND created_at >= '2026-01-01';

Q36. What is the leftmost prefix rule for composite indexes?

A: An index on (a, b, c) can serve predicates on a, (a, b), or (a, b, c) — but not on b or c alone, because the tree is sorted by a first. A range condition on a column stops index usage for the columns after it. Order composite index columns as: equality columns first, then the range/sort column.

CREATE INDEX idx_abc ON t (a, b, c);
SELECT * FROM t WHERE a = 1 AND b = 2;        -- uses (a, b)
SELECT * FROM t WHERE b = 2;                  -- cannot use idx_abc
SELECT * FROM t WHERE a = 1 AND b > 5 AND c = 3;  -- uses (a, b); c not used for seek

Q37. When will MySQL not use an available index?

A: Common cases: a function or arithmetic applied to the indexed column (WHERE YEAR(created_at) = 2026), a leading wildcard (LIKE '%foo'), an implicit type conversion (comparing a string column to a number, or mismatched collations in a join), OR conditions across different columns, and low selectivity where the optimizer estimates a full scan is cheaper. Rewrite predicates to be "sargable": created_at >= '2026-01-01' AND created_at < '2027-01-01'.

Q38. What is index cardinality and why does it matter?

A: Cardinality is the estimated number of distinct values in an index; selectivity is cardinality divided by row count. High-selectivity indexes (email, user_id) filter rows efficiently; low-selectivity ones (gender, boolean flags) rarely help alone. The optimizer relies on these statistics, so stale stats can cause bad plans.

SHOW INDEX FROM orders;      -- see Cardinality column
ANALYZE TABLE orders;        -- refresh statistics

Q39. What is a prefix index?

A: A prefix index indexes only the first N characters of a string column, saving space and making the tree shallower. It is required for TEXT/BLOB columns. Trade-offs: it cannot be a covering index and cannot fully optimize ORDER BY/GROUP BY. Choose N so prefix selectivity approaches full-column selectivity.

-- find a good prefix length
SELECT COUNT(DISTINCT LEFT(url, 20)) / COUNT(DISTINCT url) FROM pages;
CREATE INDEX idx_url ON pages (url(20));

Q40. How does FULLTEXT search work in InnoDB?

A: InnoDB FULLTEXT indexes (5.6+) build an inverted index of words to documents and are queried with MATCH ... AGAINST in natural language or boolean mode. They respect innodb_ft_min_token_size and stopword lists. For heavy search workloads, teams typically move to Elasticsearch/OpenSearch, but FULLTEXT is fine for moderate needs.

CREATE FULLTEXT INDEX idx_ft ON articles (title, body);
SELECT id, title FROM articles
WHERE MATCH(title, body) AGAINST ('+mysql -oracle' IN BOOLEAN MODE);

Q41. Why can too many indexes hurt performance?

A: Every INSERT/UPDATE/DELETE must maintain every index, so each extra index adds write amplification, more page splits, more buffer pool pressure, and longer DDL. Too many similar indexes can also confuse the optimizer into picking a worse one. Audit periodically and drop unused or redundant indexes (an index on (a) is redundant if (a, b) exists).

Q42. What are descending indexes in MySQL 8?

A: Before 8.0, DESC in an index definition was parsed but ignored; MySQL 8 actually stores descending key order. This matters for mixed-direction sorts like ORDER BY created_at DESC, id ASC, which previously forced a filesort — a matching mixed-order index now serves it directly.

CREATE INDEX idx_feed ON posts (created_at DESC, id ASC);
SELECT * FROM posts ORDER BY created_at DESC, id ASC LIMIT 20;  -- no filesort

Q43. What are functional indexes?

A: Since 8.0.13 you can index an expression, letting queries that filter on that expression use an index instead of a full scan. Internally it is implemented as a hidden virtual generated column. The query must use the exact same expression to match.

CREATE INDEX idx_email_lower ON users ((LOWER(email)));
SELECT * FROM users WHERE LOWER(email) = 'raj@example.com';  -- uses index

Q44. How do you find unused or duplicate indexes?

A: The sys schema (built on performance_schema) tracks index usage since server start. Combine schema_unused_indexes with schema_redundant_indexes, observe over a representative period (include month-end batch jobs!), and drop candidates — using ALTER TABLE ... ALTER INDEX ... INVISIBLE first as a safe trial.

SELECT * FROM sys.schema_unused_indexes;
SELECT * FROM sys.schema_redundant_indexes;

Transactions & Isolation Levels

Q45. How does InnoDB implement each ACID property?

A: Atomicity: undo logs allow rolling back partial changes. Consistency: constraints (PK, FK, CHECK) plus the combination of the other three properties keep data valid. Isolation: MVCC snapshots and locking control what concurrent transactions see. Durability: the redo log is flushed on commit (with innodb_flush_log_at_trx_commit=1), so committed changes survive a crash.

Q46. What are the four isolation levels and what anomalies do they allow?

A: READ UNCOMMITTED: allows dirty reads, non-repeatable reads, phantoms. READ COMMITTED: prevents dirty reads; allows non-repeatable reads and phantoms. REPEATABLE READ (InnoDB default): prevents dirty and non-repeatable reads; InnoDB additionally blocks most phantoms via next-key locking and snapshot reads. SERIALIZABLE: prevents all anomalies by making plain SELECTs acquire shared locks — highest safety, lowest concurrency.

Q47. What is MySQL's default isolation level and how do you change it?

A: InnoDB defaults to REPEATABLE READ — unlike Oracle, PostgreSQL, and SQL Server, which default to READ COMMITTED. Many high-throughput shops deliberately switch to READ COMMITTED to reduce gap locking and deadlocks. It can be set globally, per session, or per transaction (and from JDBC via Connection.setTransactionIsolation()).

SELECT @@transaction_isolation;              -- REPEATABLE-READ
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

// JDBC
conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);

Q48. Explain dirty read, non-repeatable read, and phantom read.

A: A dirty read reads data another transaction has written but not yet committed — if it rolls back, you acted on data that never existed. A non-repeatable read is when re-reading the same row within one transaction returns a different value because another transaction committed an update in between. A phantom read is when re-running the same range query returns new rows inserted by another committed transaction.

Q49. What is MVCC and how does InnoDB implement it?

A: Multi-Version Concurrency Control lets readers see a consistent snapshot without blocking writers. Each row carries hidden fields (transaction id and a roll pointer into the undo log chain); a transaction's read view defines which versions are visible, and older versions are reconstructed by walking the undo chain. Result: plain SELECTs take no row locks at READ COMMITTED / REPEATABLE READ.

Q50. How does InnoDB's REPEATABLE READ handle phantom reads?

A: For consistent (non-locking) reads, the transaction reuses the snapshot created at its first read, so re-executed range queries return the same rows — no phantoms. For locking reads (SELECT ... FOR UPDATE, UPDATE, DELETE), InnoDB takes next-key locks that lock both the matching index records and the gaps between them, preventing other transactions from inserting phantoms into the scanned range. Mixing snapshot reads and current reads in one transaction can still produce surprising results (write skew), which interviewers love to probe.

Q51. What is the difference between a consistent read and a locking read?

A: A consistent read (plain SELECT) reads from the MVCC snapshot and takes no locks — it may return slightly stale data relative to concurrent commits. A locking (current) read reads the latest committed version and locks it: FOR SHARE takes shared locks, FOR UPDATE takes exclusive locks. Use locking reads for read-modify-write sequences like inventory decrement.

START TRANSACTION;
SELECT stock FROM inventory WHERE product_id = 7 FOR UPDATE;  -- locks row
UPDATE inventory SET stock = stock - 1 WHERE product_id = 7;
COMMIT;

-- MySQL 8: skip contended rows in job queues
SELECT * FROM jobs WHERE status='NEW' LIMIT 1 FOR UPDATE SKIP LOCKED;

Q52. How does autocommit work in MySQL?

A: With autocommit=1 (the default), every statement is its own transaction and commits immediately. START TRANSACTION (or JDBC conn.setAutoCommit(false)) suspends it until COMMIT/ROLLBACK. A frequent Java bug is disabling autocommit in a pool and forgetting to commit — changes silently roll back when the connection is returned, or worse, idle transactions hold locks and block undo purge.

conn.setAutoCommit(false);
try {
    debit(conn, fromAcct, amount);
    credit(conn, toAcct, amount);
    conn.commit();
} catch (SQLException e) {
    conn.rollback();
    throw e;
}

Q53. What is a savepoint?

A: A savepoint is a named marker inside a transaction that you can roll back to without aborting the whole transaction — useful for retrying one step of a multi-step unit of work. JDBC exposes it via Connection.setSavepoint(), and Spring's PROPAGATION_NESTED is implemented with savepoints.

START TRANSACTION;
INSERT INTO orders (...) VALUES (...);
SAVEPOINT after_order;
INSERT INTO loyalty_points (...) VALUES (...);   -- optional step
ROLLBACK TO SAVEPOINT after_order;               -- undo only that step
COMMIT;

Q54. What happens internally when a transaction rolls back?

A: InnoDB walks the transaction's undo log records in reverse and applies the inverse of each change, restoring prior row versions and removing inserted rows. Locks are released once rollback finishes. Because rollback replays every change one by one, rolling back a huge transaction can take as long as (or longer than) making the changes — another reason to keep transactions small.

Q55. How do you manage MySQL transactions in Spring?

A: Annotate service methods with @Transactional; Spring begins a transaction on entry, commits on normal return, and rolls back on unchecked exceptions by default (checked exceptions need rollbackFor). Key gotchas: the annotation only works through proxy calls (self-invocation bypasses it), and long methods holding transactions open cause lock contention and connection pool exhaustion.

@Service
public class TransferService {
    @Transactional(isolation = Isolation.READ_COMMITTED, timeout = 5)
    public void transfer(long from, long to, BigDecimal amount) {
        accountRepo.debit(from, amount);
        accountRepo.credit(to, amount);
    }   // commit on return, rollback on RuntimeException
}

Locking: Row, Gap & Next-Key

Q56. What is the difference between row-level and table-level locking?

A: Table-level locking (MyISAM, explicit LOCK TABLES) locks the entire table, so one writer blocks all other readers/writers — cheap to manage but terrible for concurrency. Row-level locking (InnoDB) locks only the index records a statement touches, allowing thousands of concurrent transactions on different rows at the cost of more lock-management overhead and possible deadlocks.

Q57. What are shared (S) and exclusive (X) locks?

A: A shared lock allows other transactions to also take shared locks on the same row (many concurrent readers) but blocks exclusive locks. An exclusive lock blocks both other X and S locks — required to modify a row. UPDATE/DELETE take X locks implicitly; SELECT can take them explicitly.

SELECT * FROM accounts WHERE id = 1 FOR SHARE;    -- S lock
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;   -- X lock
-- view current lock waits
SELECT * FROM performance_schema.data_lock_waits;

Q58. Explain record locks, gap locks, and next-key locks.

A: A record lock locks a single index record. A gap lock locks the open interval between index records (or before the first / after the last), preventing inserts into that gap while locking nothing that exists. A next-key lock is the combination: the record plus the gap before it. InnoDB uses next-key locks for range scans at REPEATABLE READ to stop phantom inserts.

Q59. When does InnoDB use gap locks, and how can you reduce them?

A: Gap locks appear during locking reads, UPDATEs, and DELETEs that scan a range or use a non-unique index at REPEATABLE READ; a unique-index equality match on an existing row needs only a record lock. To reduce gap locking: switch to READ COMMITTED (gap locks are mostly disabled there except for FK and duplicate checks), always filter by unique indexes for point updates, and keep proper indexes so statements do not scan (an unindexed UPDATE locks every scanned row).

Q60. What are intention locks?

A: Intention locks are table-level flags (IS/IX) a transaction sets before taking row-level S/X locks, announcing "I hold or will hold row locks in this table." They let a table-level request (like LOCK TABLES ... WRITE or some DDL) detect conflicts with row locks instantly instead of scanning every row. Intention locks never conflict with each other.

Q61. What is a deadlock and how does InnoDB handle it?

A: A deadlock is a cycle where transactions each hold locks the others need — e.g., T1 locks row A then wants B, while T2 locks B then wants A. InnoDB detects the cycle immediately and kills the transaction with the smallest rollback footprint, returning error 1213. Applications should treat 1213 as retryable and rerun the transaction.

SHOW ENGINE INNODB STATUS;      -- LATEST DETECTED DEADLOCK section
SET GLOBAL innodb_print_all_deadlocks = ON;   -- log every deadlock

// Java: retry on deadlock
catch (SQLTransactionRollbackException e) { /* errorCode 1213 → retry */ }

Q62. How do you minimize deadlocks in a Java application?

A: Access rows and tables in a consistent global order (e.g., always lock the lower account id first); keep transactions short and never hold them across remote calls or user think-time; index your WHERE clauses so statements lock fewer rows; consider READ COMMITTED to cut gap-lock deadlocks; and implement automatic retry with backoff for error 1213 since occasional deadlocks are unavoidable.

Q63. What is innodb_lock_wait_timeout?

A: It is how long (default 50 seconds) a transaction waits for a row lock before failing with error 1205 "Lock wait timeout exceeded". Unlike a deadlock, only the waiting statement gives up — the transaction is not rolled back by default (unless innodb_rollback_on_timeout=ON), which can leave a half-done transaction holding locks. Web applications usually lower it to a few seconds and fail fast.

Q64. What are metadata locks (MDL)?

A: Every statement takes a metadata lock on the tables it touches, held until the transaction ends, so a table cannot be ALTERed while queries use it. A classic incident: a long-running transaction holds an MDL, an ALTER TABLE queues behind it waiting for exclusive MDL, and then every new query on that table queues behind the ALTER — the whole app appears frozen. Diagnose with performance_schema.metadata_locks and keep transactions short.

Replication & Binlog

Q65. What is the binary log and what formats does it support?

A: The binlog records all data-changing events in commit order and powers replication, point-in-time recovery, and CDC tools like Debezium. Formats: STATEMENT (logs SQL text — compact but unsafe for non-deterministic statements), ROW (logs actual row changes — safe, the default since 5.7), and MIXED (statement, switching to row when unsafe).

SHOW VARIABLES LIKE 'binlog_format';   -- ROW
SHOW BINARY LOGS;
SHOW BINLOG EVENTS IN 'binlog.000042' LIMIT 10;

Q66. How does MySQL asynchronous replication work?

A: The source writes committed changes to its binlog. On each replica, an I/O (receiver) thread connects to the source, streams binlog events, and appends them to a local relay log; SQL (applier) threads then execute the relay-log events. Because it is asynchronous, the source does not wait for replicas — replicas can lag, and a source crash can lose transactions not yet shipped.

Q67. What is semi-synchronous replication?

A: With semi-sync, a commit on the source blocks until at least rpl_semi_sync_source_wait_for_replica_count replicas acknowledge receiving (not applying) the transaction in their relay logs. This bounds data loss on source failure to near zero at the cost of added commit latency, and it degrades to async if no replica responds within the timeout. Group Replication / InnoDB Cluster is the stronger consistency option.

Q68. What is GTID-based replication?

A: A GTID (Global Transaction Identifier, server_uuid:sequence) uniquely tags every transaction across the topology. Replicas track the set of executed GTIDs, so failover and re-pointing a replica no longer require hunting for binlog file/position coordinates — CHANGE REPLICATION SOURCE TO ... SOURCE_AUTO_POSITION=1 just works. It also lets tools detect errant transactions on replicas.

-- my.cnf
gtid_mode = ON
enforce_gtid_consistency = ON

CHANGE REPLICATION SOURCE TO
  SOURCE_HOST='primary.db', SOURCE_AUTO_POSITION=1;
START REPLICA;

Q69. What is replication lag and how do you detect and reduce it?

A: Lag is the delay between a commit on the source and its application on a replica; reading a lagging replica returns stale data (a user saves a profile, reloads, and sees the old value). Detect it with SHOW REPLICA STATUS (Seconds_Behind_Source) or better, a heartbeat table. Reduce it with multi-threaded appliers (replica_parallel_workers with WRITESET dependency tracking), smaller transactions, and faster replica storage.

SHOW REPLICA STATUS\G   -- Seconds_Behind_Source, Retrieved/Executed_Gtid_Set

-- my.cnf on replica
replica_parallel_workers = 8
replica_preserve_commit_order = ON

Q70. What is the difference between the redo log and the binlog?

A: The redo log is an InnoDB-internal, physical, circular log used only for crash recovery of the local instance. The binlog is a server-level, logical, append-only log of committed changes used for replication and PITR. A commit writes both, coordinated by an internal two-phase commit (XA) so the two logs never disagree about whether a transaction committed.

Q71. How does point-in-time recovery (PITR) work?

A: Restore the most recent full backup (mysqldump, MySQL Shell dump, or Percona XtraBackup), then replay binlogs from the backup's coordinates up to just before the bad event (e.g., an accidental DELETE), using --stop-datetime or a GTID/position boundary. This is why binlog retention must exceed your backup interval.

mysqlbinlog --start-position=4 \
  --stop-datetime="2026-07-03 09:00:00" \
  binlog.000041 binlog.000042 | mysql -u root -p

Q72. How do Java applications implement read/write splitting with MySQL replicas?

A: Options: Spring's AbstractRoutingDataSource routing @Transactional(readOnly=true) calls to a replica pool; a proxy layer (ProxySQL, MySQL Router) that routes by statement; or Connector/J's replication topology support. The hard part is read-your-own-writes consistency — route post-write reads to the primary, use session consistency via GTID wait (WAIT_FOR_EXECUTED_GTID_SET), or accept staleness explicitly per use case.

Q73. What do sync_binlog and innodb_flush_log_at_trx_commit control?

A: innodb_flush_log_at_trx_commit controls redo-log flushing: 1 = fsync every commit (full durability), 2 = write to OS cache per commit and fsync each second (can lose ~1s on OS crash), 0 = flush once per second (fastest, least safe). sync_binlog is the same idea for the binlog: 1 = fsync per commit. The fully durable "double 1" setting (1/1) is required for lossless failover; relaxing them trades durability for write throughput.

-- my.cnf — fully durable configuration
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1

Partitioning

Q74. What is table partitioning and what types does MySQL support?

A: Partitioning splits one logical table into multiple physical segments based on a column expression, transparently to queries. MySQL supports RANGE (value ranges, typically dates), LIST (explicit value sets), HASH (modulo of an expression for even spread), and KEY (server-provided hashing), plus RANGE/LIST COLUMNS variants and subpartitioning. Only InnoDB-native partitioning is supported in MySQL 8.

Q75. Show a typical RANGE partitioning scheme for time-series data.

A: RANGE partitioning by month or day is the standard pattern for logs, events, and audit tables — new partitions are pre-created ahead of time, and old ones are dropped cheaply. Always include a MAXVALUE catch-all partition (or automate partition creation) so inserts beyond the last defined range do not fail.

CREATE TABLE events (
  id BIGINT AUTO_INCREMENT,
  created_at DATETIME NOT NULL,
  payload JSON,
  PRIMARY KEY (id, created_at)          -- PK must include partition column
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
  PARTITION p2026_06 VALUES LESS THAN (TO_DAYS('2026-07-01')),
  PARTITION p2026_07 VALUES LESS THAN (TO_DAYS('2026-08-01')),
  PARTITION pmax     VALUES LESS THAN MAXVALUE
);

Q76. What is partition pruning?

A: Pruning is the optimizer skipping partitions that cannot contain matching rows, based on the WHERE clause — a query on one day of data touches one partition instead of the whole table. It only works when the predicate references the partitioning expression's column in a way the optimizer can analyze. Verify with EXPLAIN's partitions column.

EXPLAIN SELECT * FROM events
WHERE created_at >= '2026-07-01' AND created_at < '2026-07-02';
-- partitions: p2026_07   (only one partition scanned)

Q77. What are the main limitations of MySQL partitioning?

A: Every UNIQUE key (including the primary key) must include all partitioning columns; foreign keys are not supported on partitioned tables; there is no global index across partitions, so queries that do not prune must open every partition; and the partitioning expression is limited to certain functions on integer/date columns. These constraints are why partitioning is mostly used for time-based retention rather than general performance.

Q78. What is the difference between partitioning and sharding?

A: Partitioning splits data within one server — one instance, one buffer pool, invisible to the application. Sharding splits data across multiple servers, each holding a subset keyed by a shard key (e.g., user_id ranges or hashes); the application or a middleware layer (Vitess, ShardingSphere) must route queries, and cross-shard joins/transactions become application problems. Partitioning helps with data lifecycle; sharding is for scaling writes and total data volume beyond one machine.

Q79. How do partitions make deleting old data fast?

A: DROP PARTITION removes a whole segment as a file-level metadata operation — milliseconds and minimal undo/redo, versus a huge DELETE WHERE created_at < ... that generates massive undo, binlog volume, and replication lag on millions of rows. This retention pattern is the single strongest reason to partition large append-only tables.

ALTER TABLE events DROP PARTITION p2026_06;        -- instant retention
ALTER TABLE events ADD PARTITION (
  PARTITION p2026_08 VALUES LESS THAN (TO_DAYS('2026-09-01'))
);

Performance Tuning & EXPLAIN

Q80. How do you read EXPLAIN output?

A: Focus on: type (access method), key (index actually chosen), rows (estimated rows examined), filtered (percentage surviving remaining conditions), and Extra (Using index, Using filesort, Using temporary). A healthy OLTP query uses a specific key, examines rows proportional to what it returns, and avoids filesort/temporary on large sets.

EXPLAIN SELECT o.id, o.total
FROM orders o
WHERE o.customer_id = 42 AND o.status = 'SHIPPED';
-- type: ref | key: idx_customer_status | rows: 12 | Extra: Using index condition

Q81. What are the EXPLAIN access types, from best to worst?

A: Roughly: system/const (single row by PK/unique constant), eq_ref (one row per join via unique index), ref (non-unique index equality), range (index range scan), index (full scan of the index tree), and ALL (full table scan). ALL or index on a large table in a hot path is the first thing to fix — usually by adding or fixing an index.

Q82. What does EXPLAIN ANALYZE add over EXPLAIN?

A: EXPLAIN ANALYZE (8.0.18+) actually executes the query and reports the real iterator tree with actual times, actual row counts, and loop counts — exposing where estimates were wrong. It is the fastest way to find the truly expensive step in a complex join or subquery. Remember it runs the query, so be careful with slow statements on production (it discards the result set, and you should not use it on destructive DML).

EXPLAIN ANALYZE
SELECT c.name, SUM(o.total)
FROM customers c JOIN orders o ON o.customer_id = c.id
GROUP BY c.name;
-- -> Aggregate ... (actual time=..., rows=..., loops=1)

Q83. What do "Using filesort" and "Using temporary" mean in EXPLAIN?

A: Using filesort means MySQL must sort result rows itself (in the sort buffer, spilling to disk if large) because no index delivers the required order — despite the name it is not always on disk. Using temporary means an intermediate temporary table is needed, typically for GROUP BY/DISTINCT on non-indexed expressions. Both are fine on small sets but should be eliminated on large hot queries by aligning indexes with the ORDER BY/GROUP BY.

Q84. What is the slow query log and how do you use it?

A: The slow query log records statements exceeding long_query_time (and optionally those not using indexes). It is the primary tool for finding what to optimize. Aggregate it with pt-query-digest to rank queries by total time consumed — the top three usually account for most of the load.

-- my.cnf
slow_query_log = ON
long_query_time = 0.2
log_queries_not_using_indexes = ON

pt-query-digest /var/lib/mysql/host-slow.log | head -50

Q85. Why is OFFSET pagination slow and what is the alternative?

A: LIMIT 20 OFFSET 100000 forces MySQL to generate and discard 100,000 rows before returning 20 — cost grows linearly with page depth. Keyset (seek) pagination instead filters on the last-seen key, so every page is an index seek of constant cost. The trade-off is that you cannot jump to an arbitrary page number.

-- slow at depth
SELECT id, title FROM posts ORDER BY id DESC LIMIT 20 OFFSET 100000;

-- keyset pagination: constant time
SELECT id, title FROM posts
WHERE id < :lastSeenId
ORDER BY id DESC LIMIT 20;

Q86. Why is SELECT * considered harmful in production code?

A: It fetches columns you do not need (wasting network, memory, and JDBC deserialization time, especially with TEXT/BLOB columns), it defeats covering indexes because the query needs every column, and it couples code to schema — adding a column can silently change result sets, break INSERT ... SELECT, or inflate ORM entities. Select explicit columns, or use projections/DTOs in JPA.

Q87. Why is COUNT(*) slow on large InnoDB tables and what can you do?

A: Because of MVCC, different transactions can see different row counts, so InnoDB cannot keep one authoritative counter (unlike MyISAM) — COUNT(*) must scan an index. Mitigations: let it scan the smallest secondary index (the optimizer usually does), keep a summary/counter table updated transactionally or asynchronously, use information_schema.tables.table_rows when an estimate suffices, or move analytics counts to a warehouse. Note COUNT(*) and COUNT(1) are equivalent; COUNT(col) skips NULLs.

Q88. Which server parameters matter most for InnoDB performance?

A: The big ones: innodb_buffer_pool_size (the single most important — cache your working set), redo log capacity via innodb_redo_log_capacity (bigger smooths write bursts), innodb_flush_log_at_trx_commit and sync_binlog (durability vs write latency), max_connections plus thread cache, and innodb_io_capacity tuned to your storage. Most other knobs are second-order; measure before tuning.

-- my.cnf (32GB dedicated DB host, SSD)
innodb_buffer_pool_size = 24G
innodb_redo_log_capacity = 4G
innodb_io_capacity = 2000
innodb_flush_log_at_trx_commit = 1
max_connections = 500

Q89. Why do Java services use a connection pool, and how should it be sized?

A: Creating a MySQL connection costs a TCP+TLS handshake and authentication (tens of milliseconds), and each server connection consumes a thread and memory — pooling amortizes this and caps concurrency. Pools should be small: throughput usually peaks near 2 × cores of the DB host, not hundreds; oversized pools just queue inside MySQL instead of inside the app. HikariCP is the de facto standard in Spring Boot.

# application.yml
spring:
  datasource:
    hikari:
      maximum-pool-size: 20
      minimum-idle: 5
      connection-timeout: 3000
      max-lifetime: 1800000   # < MySQL wait_timeout

Q90. Production is suddenly slow — how do you diagnose MySQL in the moment?

A: First look at what is running now: SHOW PROCESSLIST (or sys.session) for pile-ups and lock waits; check performance_schema.data_lock_waits and sys.innodb_lock_waits for blocking transactions; check SHOW ENGINE INNODB STATUS for deadlocks and history list length (undo backlog from a long transaction); confirm replication lag and disk saturation. The cause is usually one bad new query, a lock-holding long transaction, or a missing index after a deploy.

SELECT * FROM sys.session WHERE command <> 'Sleep' ORDER BY time DESC;
SELECT * FROM sys.innodb_lock_waits;   -- who blocks whom, with kill statements
SHOW ENGINE INNODB STATUS\G

MySQL 8 Features & Java/JDBC Integration

Q91. What are CTEs and recursive CTEs in MySQL 8?

A: A Common Table Expression (WITH name AS (...)) is a named subquery that makes complex queries readable and can be referenced multiple times. A recursive CTE references itself, enabling hierarchy traversal (org charts, category trees) and sequence generation — previously impossible in plain MySQL without stored procedures.

WITH RECURSIVE subordinates AS (
  SELECT id, name, manager_id FROM employees WHERE id = 1
  UNION ALL
  SELECT e.id, e.name, e.manager_id
  FROM employees e
  JOIN subordinates s ON e.manager_id = s.id
)
SELECT * FROM subordinates;

Q92. What are window functions and when do you use them?

A: Window functions (8.0+) compute values over a set of related rows without collapsing them like GROUP BY: ROW_NUMBER(), RANK(), LAG()/LEAD(), and aggregates with OVER (PARTITION BY ... ORDER BY ...). Classic interview use case: "top N per group" — latest order per customer, highest salary per department.

SELECT * FROM (
  SELECT o.*,
         ROW_NUMBER() OVER (PARTITION BY customer_id
                            ORDER BY created_at DESC) AS rn
  FROM orders o
) t WHERE rn = 1;   -- latest order per customer

Q93. What are invisible indexes and what are they for?

A: An invisible index is fully maintained on writes but ignored by the optimizer. It is the safe way to test dropping an index: make it invisible, watch query performance for a few days (covering batch cycles), and only then drop it — flipping it back visible is instant, whereas rebuilding a dropped index on a huge table is not. Also useful in reverse: create a new index invisible, then enable it during a low-traffic window.

ALTER TABLE orders ALTER INDEX idx_legacy INVISIBLE;
-- regression? flip back instantly:
ALTER TABLE orders ALTER INDEX idx_legacy VISIBLE;
-- confident? drop it:
DROP INDEX idx_legacy ON orders;

Q94. How do you index data inside a JSON column?

A: Two ways: create a (possibly stored) generated column extracting the JSON path and index that, or use a MySQL 8.0.17+ multi-valued index with CAST(... AS ... ARRAY) for JSON arrays queried via MEMBER OF/JSON_CONTAINS. Without one of these, JSON path predicates always full-scan.

ALTER TABLE products
  ADD COLUMN brand VARCHAR(64)
    GENERATED ALWAYS AS (attrs->>'$.brand') STORED,
  ADD INDEX idx_brand (brand);

-- multi-valued index on a JSON array
CREATE INDEX idx_tags ON products ((CAST(attrs->'$.tags' AS CHAR(32) ARRAY)));
SELECT * FROM products WHERE 'sale' MEMBER OF (attrs->'$.tags');

Q95. What other notable features arrived in MySQL 8?

A: Highlights beyond CTEs/window functions: a transactional data dictionary and atomic DDL; roles for grouping privileges; utf8mb4 as default charset; instant ADD COLUMN (metadata-only ALTER); persisted AUTO_INCREMENT; caching_sha2_password as the default auth plugin; SELECT ... FOR UPDATE SKIP LOCKED / NOWAIT; hash joins (8.0.18+); and the removal of the query cache.

Q96. How do you configure JDBC for MySQL 8 correctly?

A: Use Connector/J 8.x with driver class com.mysql.cj.jdbc.Driver (auto-registered via SPI, no Class.forName needed). Set timezone and charset explicitly, enable TLS, and align pool max-lifetime below the server's wait_timeout. With the default caching_sha2_password plugin over non-TLS connections you may need allowPublicKeyRetrieval=true in dev.

jdbc:mysql://db:3306/appdb?connectionTimeZone=UTC&characterEncoding=utf8&sslMode=REQUIRED&rewriteBatchedStatements=true&cachePrepStmts=true&useServerPrepStmts=true&prepStmtCacheSize=250&prepStmtCacheSqlLimit=2048

Q97. What does rewriteBatchedStatements do?

A: By default, a JDBC batch of N inserts is still sent as N round-trips. With rewriteBatchedStatements=true, Connector/J rewrites them into one multi-row INSERT ... VALUES (...),(...),... statement — commonly a 10–50x speedup for bulk loads. Keep the rewritten packet under max_allowed_packet, and note that per-row generated-key retrieval semantics can differ.

try (PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO metrics (name, val) VALUES (?, ?)")) {
    for (Metric m : metrics) {
        ps.setString(1, m.name());
        ps.setDouble(2, m.value());
        ps.addBatch();
    }
    ps.executeBatch();   // one multi-row INSERT on the wire
}

Q98. How do you stream a huge result set from MySQL in Java without OutOfMemoryError?

A: Connector/J buffers the entire result set in memory by default. For large exports, enable streaming: stmt.setFetchSize(Integer.MIN_VALUE) (row-by-row streaming) or set useCursorFetch=true with a positive fetch size for server-side cursors. While streaming, the connection is busy until the result set is fully read or closed, so do not reuse it concurrently.

PreparedStatement ps = conn.prepareStatement(
    "SELECT * FROM events",
    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
ps.setFetchSize(Integer.MIN_VALUE);   // stream rows, don't buffer all
try (ResultSet rs = ps.executeQuery()) {
    while (rs.next()) process(rs);
}

Q99. What do useServerPrepStmts and cachePrepStmts control?

A: By default Connector/J uses client-side prepared statements (it interpolates parameters and sends full SQL text). useServerPrepStmts=true switches to true server-side prepares (parse once, execute many via the binary protocol), and cachePrepStmts=true with prepStmtCacheSize/prepStmtCacheSqlLimit caches prepared statements per connection so repeated queries skip re-preparing. HikariCP's MySQL guidance recommends enabling all of these for hot OLTP paths.

Q100. What MySQL-related production issues most often bite Java services, and how do you prevent them?

A: The classics: connection exhaustion ("Too many connections" — size pools deliberately, fix connection leaks with leak detection); stale pooled connections killed by wait_timeout (set Hikari max-lifetime lower and use JDBC4 validation); lock wait timeouts and deadlocks from long transactions spanning remote calls (keep transactions short, retry 1213/1205); timezone and utf8mb4 mismatches corrupting data (pin UTC and utf8mb4 end-to-end); and packet errors on large writes (raise max_allowed_packet). Every one of these is a configuration-and-discipline problem, not a MySQL bug — which is exactly the point to make in an interview.