Chapter 16 The debugger (ocamldebug)


This chapter describes the OCaml source-level replay debugger ocamldebug.

Unix: The debugger is available on Unix systems that provide BSD sockets.

Windows: The debugger is available under the Cygwin port of OCaml, but not under the native Win32 ports.

16.1 Compiling for debugging

Before the debugger can be used, the program must be compiled and linked with the -g option: all .cmo and .cma files that are part of the program should have been created with ocamlc -g, and they must be linked together with ocamlc -g.

Compiling with -g entails no penalty on the running time of programs: object files and bytecode executable files are bigger and take longer to produce, but the executable files run at exactly the same speed as if they had been compiled without -g.
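As a concrete starting point, here is a small program that can be used to try the debugger. The file name fib.ml and the executable name fib are assumptions chosen for this sketch; the commands in the leading comment use only the -g option described above and the ocamldebug invocation described in section 16.2.

```ocaml
(* fib.ml: a hypothetical example program for trying out ocamldebug.

   Compile and link with -g (bytecode compiler), then start the debugger:

     ocamlc -g -c fib.ml
     ocamlc -g -o fib fib.cmo
     ocamldebug ./fib
*)

(* Naive Fibonacci: deliberately recursive, so that there are many
   function applications (and hence many events) to step through. *)
let rec fib n =
  if n < 2 then n
  else fib (n - 1) + fib (n - 2)

let () =
  Printf.printf "fib 10 = %d\n" (fib 10)
```

Any bytecode program works equally well; a recursive function is convenient only because it gives the stepping and call-stack commands something to show.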

16.2 Invocation

16.2.1 Starting the debugger

The OCaml debugger is invoked by running the program ocamldebug with the name of the bytecode executable file as first argument:

        ocamldebug [options] program [arguments]

The arguments following program are optional, and are passed as command-line arguments to the program being debugged. (See also the set arguments command.)

The following command-line options are recognized:

-c count
Set the maximum number of simultaneously live checkpoints to count.
-cd dir
Run the debugger program from the working directory dir, instead of the current directory. (See also the cd command.)
-emacs
Tell the debugger it is executed under Emacs. (See section 16.10 for information on how to run the debugger under Emacs.)
-I directory
Add directory to the list of directories searched for source files and compiled files. (See also the directory command.)
-s socket
Use socket for communicating with the debugged program. See the description of the command set socket (section 16.8.6) for the format of socket.
-version
Print version string and exit.
-vnum
Print short version number and exit.
-help or --help
Display a short usage summary and exit.

16.2.2 Initialization file

On start-up, the debugger will read commands from an initialization file before giving control to the user. The default file is .ocamldebug in the current directory if it exists, otherwise .ocamldebug in the user’s home directory.
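The lookup order just described can be sketched in a few lines of OCaml. This is an illustration of the rule, not the debugger's own code: the function name init_file, the use of the HOME environment variable to find the home directory, and the choice to return None when neither file exists are all assumptions of this sketch.

```ocaml
(* Sketch of the lookup rule for the initialization file.
   [exists] is passed in so that the rule can be tried without touching
   the file system; [Sys.file_exists] is the natural choice in real use. *)
let init_file ~exists ~cwd ~home =
  let local = Filename.concat cwd ".ocamldebug" in
  if exists local then Some local
  else
    match home with
    | Some dir ->
      let global = Filename.concat dir ".ocamldebug" in
      if exists global then Some global else None
    | None -> None

let () =
  (* Assumption: the home directory is taken from the HOME variable. *)
  let home = Sys.getenv_opt "HOME" in
  match init_file ~exists:Sys.file_exists ~cwd:(Sys.getcwd ()) ~home with
  | Some path -> Printf.printf "would read commands from %s\n" path
  | None -> print_endline "no initialization file found"
```

The file itself simply contains debugger commands, one per line, such as those described in the rest of this chapter.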

16.2.3 Exiting the debugger

The command quit exits the debugger. You can also exit the debugger by typing an end-of-file character (usually ctrl-D).

Typing an interrupt character (usually ctrl-C) will not exit the debugger, but will terminate the action of any debugger command that is in progress and return to the debugger command level.

16.3 Commands

A debugger command is a single line of input. It starts with a command name, which is followed by arguments depending on this name. Examples:

        run
        goto 1000
        set arguments arg1 arg2

A command name can be truncated as long as there is no ambiguity. For instance, go 1000 is understood as goto 1000, since there are no other commands whose name starts with go. For the most frequently used commands, ambiguous abbreviations are allowed. For instance, r stands for run even though there are other commands starting with r. You can test the validity of an abbreviation using the help command.

If the previous command has been successful, a blank line (typing just RET) will repeat it.
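The abbreviation rule can be modelled as a prefix search with a small table of exceptions. The sketch below is only an illustration of that rule, not the debugger's implementation: the command list is a subset of the names used in this chapter, and the privileged table contains just the two abbreviations this chapter confirms (r for run and p for print).

```ocaml
(* A subset of the command names from this chapter, for illustration. *)
let commands =
  [ "run"; "reverse"; "remove_printer"; "goto"; "step"; "start";
    "backstep"; "next"; "previous"; "finish"; "print"; "display"; "quit" ]

(* Hypothetical table of privileged abbreviations: ambiguous prefixes
   that are nevertheless accepted for frequently used commands. *)
let privileged = [ ("r", "run"); ("p", "print") ]

let is_prefix p s =
  String.length p <= String.length s
  && String.sub s 0 (String.length p) = p

type resolution =
  | Command of string
  | Ambiguous of string list
  | Unknown

(* Resolve what the user typed to a full command name. *)
let resolve name =
  match List.assoc_opt name privileged with
  | Some cmd -> Command cmd
  | None ->
    if List.mem name commands then Command name
    else
      match List.filter (is_prefix name) commands with
      | [] -> Unknown
      | [ cmd ] -> Command cmd
      | several -> Ambiguous several
```

With this table, go resolves to goto because it is a unique prefix, r resolves to run through the privileged table, and re is reported as ambiguous between reverse and remove_printer.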

16.3.1 Getting help

The OCaml debugger has a simple on-line help system, which gives a brief description of each command and variable.

help
Print the list of commands.
help command
Give help about the command command.
help set variable, help show variable
Give help about the variable variable. The list of all debugger variables can be obtained with help set.
help info topic
Give help about topic. Use help info to get a list of known topics.

16.3.2 Accessing the debugger state

set variable value
Set the debugger variable variable to the value value.
show variable
Print the value of the debugger variable variable.
info subject
Give information about the given subject. For instance, info breakpoints will print the list of all breakpoints.

16.4 Executing a program

16.4.1 Events

Events are “interesting” locations in the source code, corresponding to the beginning or end of evaluation of “interesting” sub-expressions. Events are the unit of single-stepping (stepping goes to the next or previous event encountered in the program execution). Also, breakpoints can only be set at events. Thus, events play the role of line numbers in debuggers for conventional languages.

During program execution, a counter is incremented at each event encountered. The value of this counter is referred to as the current time. Thanks to reverse execution, it is possible to jump back and forth to any time of the execution.

Here is where the debugger events (written ǧ) are located in the source code:

Following a function application:

(f arg)ǧ

On entrance to a function:

fun x y z -> ǧ ...

On each case of a pattern-matching definition (function, match…with construct, try…with construct):

function pat1 -> ǧ expr1
       | ...
       | patN -> ǧ exprN

Between subexpressions of a sequence:

expr1; ǧ expr2; ǧ ...; ǧ exprN

In the two branches of a conditional expression:

if cond then ǧ expr1 else ǧ expr2

At the beginning of each iteration of a loop:

while cond do ǧ body done
for i = a to b do ǧ body done

Exceptions: A function application followed by a function return is replaced by the compiler with a jump (tail-call optimization). In this case, no event is put after the function application.
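To make these rules concrete, the following sketch annotates an ordinary program with comments at some of the places where, according to the list above, events are expected. The program and its names are hypothetical, the annotations are not exhaustive, and the info events command remains the authoritative way to see the events of a module.

```ocaml
(* Hypothetical example: comments marked "event" indicate where,
   according to the rules above, the debugger can stop. *)

let classify n =
  (* event: on entrance to the function *)
  if n < 0 then
    (* event: start of the "then" branch *)
    "negative"
  else
    (* event: start of the "else" branch *)
    "non-negative"

let rec sum_list acc = function
  | [] -> (* event: start of this case *) acc
  | x :: rest ->
    (* event: start of this case *)
    (* The recursive call is in tail position, so it is compiled as a
       jump and no event follows the application. *)
    sum_list (acc + x) rest

let () =
  let total = sum_list 0 [ 1; 2; 3 ] in
  (* event: after the application of sum_list returns *)
  print_endline (classify total);
  (* event: between the two expressions of the sequence *)
  for i = 1 to 2 do
    (* event: at the beginning of each iteration *)
    Printf.printf "iteration %d\n" i
  done
```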

16.4.2 Starting the debugged program

The debugger starts executing the debugged program only when needed. This allows setting breakpoints or assigning debugger variables before execution starts. There are several ways to start execution:

run
Run the program until a breakpoint is hit, or the program terminates.
goto 0
Load the program and stop on the first event.
goto time
Load the program and execute it until the given time. Useful when you already know approximately at what time the problem appears. Also useful to set breakpoints on function values that have not been computed at time 0 (see section 16.5).

The execution of a program is affected by certain information it receives when the debugger starts it, such as the command-line arguments to the program and its working directory. The debugger provides commands to specify this information (set arguments and cd). These commands must be used before program execution starts. If you try to change the arguments or the working directory after starting your program, the debugger will kill the program (after asking for confirmation).

16.4.3 Running the program

The following commands execute the program forward or backward, starting at the current time. The execution will stop either when specified by the command or when a breakpoint is encountered.

run
Execute the program forward from current time. Stops at next breakpoint or when the program terminates.
reverse
Execute the program backward from current time. Mostly useful to go to the last breakpoint encountered before the current time.
step [count]
Run the program and stop at the next event. With an argument, do it count times. If count is 0, run until the program terminates or a breakpoint is hit.
backstep [count]
Run the program backward and stop at the previous event. With an argument, do it count times.
next [count]
Run the program and stop at the next event, skipping over function calls. With an argument, do it count times.
previous [count]
Run the program backward and stop at the previous event, skipping over function calls. With an argument, do it count times.
finish
Run the program until the current function returns.
start
Run the program backward and stop at the first event before the current function invocation.

16.4.4 Time travel

You can jump directly to a given time, without stopping on breakpoints, using the goto command.

As you move through the program, the debugger maintains a history of the successive times you stop at. The last command can be used to revisit these times: each last command moves one step back through the history. That is useful mainly to undo commands such as step and next.

goto time
Jump to the given time.
last [count]
Go back to the latest time recorded in the execution history. With an argument, do it count times.
set history size
Set the size of the execution history.

16.4.5 Killing the program

kill
Kill the program being executed. This command is mainly useful if you wish to recompile the program without leaving the debugger.

16.5 Breakpoints

A breakpoint causes the program to stop whenever a certain point in the program is reached. It can be set in several ways using the break command. Breakpoints are assigned numbers when set, for further reference. The most comfortable way to set breakpoints is through the Emacs interface (see section 16.10).

break
Set a breakpoint at the current position in the program execution. The current position must be on an event (i.e., neither at the beginning, nor at the end of the program).
break function
Set a breakpoint at the beginning of function. This works only when the functional value of the identifier function has been computed and assigned to the identifier. Hence this command cannot be used at the very beginning of the program execution, when all identifiers are still undefined; use goto time to advance execution until the functional value is available.
break @ [module] line
Set a breakpoint in module module (or in the current module if module is not given), at the first event of line line.
break @ [module] line column
Set a breakpoint in module module (or in the current module if module is not given), at the event closest to line line, column column.
break @ [module] #character
Set a breakpoint in module module at the event closest to character number character.
break address
Set a breakpoint at the code address address.
delete [breakpoint-numbers]
Delete the specified breakpoints. Without argument, all breakpoints are deleted (after asking for confirmation).
info breakpoints
Print the list of all breakpoints.

16.6 The call stack

Each time the program performs a function application, it saves the location of the application (the return address) in a block of data called a stack frame. The frame also contains the local variables of the caller function. All the frames are allocated in a region of memory called the call stack. The command backtrace (or bt) displays parts of the call stack.

At any time, one of the stack frames is “selected” by the debugger; several debugger commands refer implicitly to the selected frame. In particular, whenever you ask the debugger for the value of a local variable, the value is found in the selected frame. The commands frame, up and down select whichever frame you are interested in.

When the program stops, the debugger automatically selects the currently executing frame and describes it briefly as the frame command does.

frame
Describe the currently selected stack frame.
frame frame-number
Select a stack frame by number and describe it. The frame currently executing when the program stopped has number 0; its caller has number 1; and so on up the call stack.
backtrace [count], bt [count]
Print the call stack. This is useful to see which sequence of function calls led to the currently executing frame. With a positive argument, print only the innermost count frames. With a negative argument, print only the outermost -count frames.
up [count]
Select and display the stack frame just “above” the selected frame, that is, the frame that called the selected frame. An argument says how many frames to go up.
down [count]
Select and display the stack frame just “below” the selected frame, that is, the frame that was called by the selected frame. An argument says how many frames to go down.
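A program with pending, non-tail calls is a convenient way to explore these commands. The sketch below is hypothetical (the names depth and outer are chosen for illustration); it is written so that several frames are live when execution is stopped inside depth.

```ocaml
(* stack_demo.ml: a hypothetical program for exploring the call stack. *)

let rec depth n =
  if n = 0 then 0
  else
    (* Not a tail call: the addition is performed after the recursive
       call returns, so each pending call keeps its own stack frame. *)
    1 + depth (n - 1)

(* The multiplication also happens after depth returns, so the call
   from outer stays on the stack while depth runs. *)
let outer n = 2 * depth n

let () = Printf.printf "%d\n" (outer 3)

(* Once stopped inside depth (for example after "break depth" and "run"),
   backtrace shows the pending calls, frame 0 being the innermost one;
   up selects the caller of the selected frame and down goes back
   towards frame 0. *)
```

As noted in section 16.5, break depth only works once the functional value has been computed, so a goto to a later time may be needed before setting that breakpoint.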

16.7 Examining variable values

The debugger can print the current value of simple expressions. The expressions can involve program variables: all the identifiers that are in scope at the selected program point can be accessed.

Expressions that can be printed are a subset of OCaml expressions, as described by the following grammar:

simple-expr	::=	lowercase-ident
	∣	{ capitalized-ident . } lowercase-ident
	∣	*
	∣	$ integer
	∣	simple-expr . lowercase-ident
	∣	simple-expr .( integer )
	∣	simple-expr .[ integer ]
	∣	! simple-expr
	∣	( simple-expr )

The first two cases refer to a value identifier, either unqualified or qualified by the path to the structure that defines it. * refers to the result just computed (typically, the value of a function application), and is valid only if the selected event is an “after” event (typically, a function application). $ integer refers to a previously printed value. The remaining four forms select part of an expression: respectively, a record field, an array element, a string element, and the current contents of a reference.

print variables
Print the values of the given variables. print can be abbreviated as p.
display variables
Same as print, but limit the depth of printing to 1. Useful to browse large data structures without printing them in full. display can be abbreviated as d.

When printing a complex expression, a name of the form $integer is automatically assigned to its value. Such names are also assigned to parts of the value that cannot be printed because the maximal printing depth is exceeded. Named values can be printed later on with the commands p $integer or d $integer. Named values are valid only as long as the program is stopped. They are forgotten as soon as the program resumes execution.

set print_depth d
Limit the printing of values to a maximal depth of d.
set print_length l
Limit the printing of values to at most l nodes printed.
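The sketch below pairs each selection form of the grammar with a value it could be applied to. The program is hypothetical: the file name inspect.ml (hence the module name Inspect) and all identifiers are assumptions, and the print commands in the comment are meant to be typed once the program is stopped at an event inside report.

```ocaml
(* inspect.ml: hypothetical program whose values exercise the forms of
   simple-expr described above. *)

type account = { owner : string; balance : int }

let counter = ref 0

let report acct history =
  incr counter;
  (* With this frame selected, one could try:
       print acct              lowercase-ident
       print Inspect.counter   capitalized-ident . lowercase-ident
       print acct.balance      simple-expr . lowercase-ident
       print history.(0)       simple-expr .( integer )
       print acct.owner.[0]    simple-expr .[ integer ]
       print !counter          ! simple-expr *)
  Printf.printf "%s: %d (first entry %d, call %d)\n"
    acct.owner acct.balance history.(0) !counter

let () = report { owner = "ada"; balance = 42 } [| 10; 20; 12 |]
```

Using display instead of print on acct or history limits the printing depth to 1, which is the more practical choice for large structures.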

16.8 Controlling the debugger

16.8.1 Setting the program name and arguments

set program file
Set the program name to file.
set arguments arguments
Give arguments as command-line arguments for the program.

A shell is used to pass the arguments to the debugged program. You can therefore use wildcards, shell variables, and file redirections inside the arguments. To debug programs that read from standard input, it is recommended to redirect their input from a file (using set arguments < input-file), otherwise input to the program and input to the debugger are not properly separated, and inputs are not properly replayed when running the program backwards.

16.8.2 How programs are loaded

The loadingmode variable controls how the program is executed.

set loadingmode direct
The program is run directly by the debugger. This is the default mode.
set loadingmode runtime
The debugger executes the OCaml runtime ocamlrun on the program. Rarely useful; moreover it prevents the debugging of programs compiled in “custom runtime” mode.
set loadingmode manual
The user starts the program manually, when asked by the debugger. Allows remote debugging (see section 16.8.6).

16.8.3 Search path for files

The debugger searches for source files and compiled interface files in a list of directories, the search path. The search path initially contains the current directory . and the standard library directory. The directory command adds directories to the path.

Whenever the search path is modified, the debugger will clear any information it may have cached about the files.

directory directorynames
Add the given directories to the search path. These directories are added at the front, and will therefore be searched first.
directory directorynames for modulename
Same as directory directorynames, but the given directories will be searched only when looking for the source file of a module that has been packed into modulename.
directory
Reset the search path. This requires confirmation.

16.8.4 Working directory

Each time a program is started in the debugger, it inherits its working directory from the current working directory of the debugger. This working directory is initially whatever it inherited from its parent process (typically the shell), but you can specify a new working directory in the debugger with the cd command or the -cd command-line option.

cd directory
Set the working directory for ocamldebug to directory.
pwd
Print the working directory for ocamldebug.

16.8.5 Turning reverse execution on and off

In some cases, you may want to turn reverse execution off. This speeds up the program execution, and is also sometimes useful for interactive programs.

Normally, the debugger takes checkpoints of the program state from time to time. That is, it makes a copy of the current state of the program (using the Unix system call fork). If the variable checkpoints is set to off, the debugger will not take any checkpoints.

set checkpoints on/off
Select whether the debugger makes checkpoints or not.

16.8.6 Communication between the debugger and the program

The debugger communicates with the program being debugged through a Unix socket. You may need to change the socket name, for example if you need to run the debugger on one machine and your program on another.

set socket socket
Use socket for communication with the program. socket can be either a file name, or an Internet port specification host:port, where host is a host name or an Internet address in dot notation, and port is a port number on the host.

On the debugged program side, the socket name is passed through the CAML_DEBUG_SOCKET environment variable.
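The two accepted forms of socket can be told apart mechanically. The following sketch classifies a specification string as either a file name or a host:port pair; it mirrors the description above and is not the debugger's own parsing code. The type socket_spec and the function parse_socket are names invented for this example, and treating anything without a valid numeric port as a file name is an assumption.

```ocaml
(* Sketch: classify a socket specification as a file name or host:port. *)
type socket_spec =
  | File of string
  | Inet of string * int   (* host name or dotted address, port number *)

let parse_socket spec =
  match String.rindex_opt spec ':' with
  | None -> File spec
  | Some i ->
    let host = String.sub spec 0 i in
    let port = String.sub spec (i + 1) (String.length spec - i - 1) in
    (match int_of_string_opt port with
     | Some p when host <> "" && p >= 0 -> Inet (host, p)
     | _ -> File spec)

(* On the program side, the name arrives in CAML_DEBUG_SOCKET. *)
let () =
  match Sys.getenv_opt "CAML_DEBUG_SOCKET" with
  | None -> print_endline "CAML_DEBUG_SOCKET is not set"
  | Some s ->
    (match parse_socket s with
     | File path -> Printf.printf "file socket: %s\n" path
     | Inet (host, port) -> Printf.printf "internet socket: %s, port %d\n" host port)
```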

16.8.7 Fine-tuning the debugger

Several variables allow fine-tuning of the debugger. Reasonable defaults are provided, and you should normally not have to change them.

set processcount count
Set the maximum number of checkpoints to count. More checkpoints facilitate going far back in time, but use more memory and create more Unix processes.

As checkpointing is quite expensive, it must not be done too often. On the other hand, backward execution is faster when checkpoints are taken more often. In particular, backward single-stepping is more responsive when many checkpoints have been taken just before the current time. To fine-tune the checkpointing strategy, the debugger does not take checkpoints at the same frequency for long displacements (e.g. run) and small ones (e.g. step). The two variables bigstep and smallstep contain the number of events between two checkpoints in each case.

set bigstep count
Set the number of events between two checkpoints for long displacements.
set smallstep count
Set the number of events between two checkpoints for small displacements.

The following commands display information on checkpoints and events:

info checkpoints
Print a list of checkpoints.
info events [module]
Print the list of events in the given module (the current module, by default).
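The trade-off behind bigstep and smallstep can be illustrated with a toy model. It is an assumption of this sketch, not a description of the debugger's actual algorithm, that a checkpoint exists at time 0 and then every interval events, and that reaching an earlier time means restarting from the latest checkpoint at or before it and replaying forward. The intervals 10000 and 1000 are arbitrary illustrative values, not the defaults.

```ocaml
(* Toy model of checkpoint spacing; assumes interval > 0 and target >= 0. *)

(* Time of the latest checkpoint at or before [target]. *)
let latest_checkpoint ~interval target = target - (target mod interval)

(* Number of events to re-execute from that checkpoint to reach [target]. *)
let events_to_replay ~interval target =
  target - latest_checkpoint ~interval target

let () =
  List.iter
    (fun interval ->
       Printf.printf "interval %5d: replay %d events to reach time 12345\n"
         interval (events_to_replay ~interval 12345))
    [ 10000; 1000 ]
```

Under this model a smaller interval shortens the replay needed for each backward move, at the price of taking checkpoints (and hence forking processes) more often, which is the balance the two variables are meant to tune.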

16.8.8 User-defined printers

Just as in the toplevel system (section 10.2), the user can register functions for printing values of certain types. For technical reasons, the debugger cannot call printing functions that reside in the program being debugged. The code for the printing functions must therefore be loaded explicitly in the debugger.

load_printer "file-name"
Load in the debugger the indicated .cmo or .cma object file. The file is loaded in an environment consisting only of the OCaml standard library plus the definitions provided by object files previously loaded using load_printer. If this file depends on other object files not yet loaded, the debugger automatically loads them if it is able to find them in the search path. The loaded file does not have direct access to the modules of the program being debugged.
install_printer printer-name
Register the function named printer-name (a value path) as a printer for objects whose types match the argument type of the function. That is, the debugger will call printer-name when it has such an object to print.

The printing function printer-name must use the Format library module to produce its output, otherwise its output will not be correctly located in the values printed by the toplevel loop.

The value path printer-name must refer to one of the functions defined by the object files loaded using load_printer. It cannot reference the functions of the program being debugged.
remove_printer printer-name
Remove the named function from the table of value printers.
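A printer module might look like the following sketch. The file name printers.ml, the type point and the function pp_point are hypothetical; the sketch also assumes that the debugged program takes the type point from this same module, so that the printer's argument type matches the values to be printed. The printer has the usual shape for Format-based printers: it takes a formatter and the value.

```ocaml
(* printers.ml: hypothetical printer module for use with load_printer. *)

type point = { x : int; y : int }

(* Writes through the Format module, as required above. *)
let pp_point fmt p = Format.fprintf fmt "(%d, %d)" p.x p.y

(* In the debugger, after compiling this file with "ocamlc -g -c":
     load_printer "printers.cmo"
     install_printer Printers.pp_point
   and later, to return to the default printing:
     remove_printer Printers.pp_point *)

(* Outside the debugger, the same function can be checked directly. *)
let () = Format.printf "%a@." pp_point { x = 3; y = 4 }
```

Keeping the printers in a separate object file, as here, fits the constraint that the loaded file has no direct access to the modules of the program being debugged.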

16.9 Miscellaneous commands

list [module] [beginning] [end]
List the source of module module, from line number beginning to line number end. By default, 20 lines of the current module are displayed, starting 10 lines before the current position.
source filename
Read debugger commands from the script filename.

16.10 Running the debugger under Emacs

The most user-friendly way to use the debugger is to run it under Emacs. See the file emacs/README in the distribution for information on how to load the Emacs Lisp files for OCaml support.

The OCaml debugger is started under Emacs by the command M-x camldebug, whose argument is the name of the executable file progname to debug. Communication with the debugger takes place in an Emacs buffer named camldebug-progname. The editing and history facilities of Shell mode are available for interacting with the debugger.

In addition, Emacs displays the source files containing the current event (the current position in the program execution) and highlights the location of the event. This display is updated synchronously with the debugger action.

The following bindings for the most common debugger commands are available in the camldebug-progname buffer:

C-c C-s
(command step): execute the program one step forward.
C-c C-k
(command backstep): execute the program one step backward.
C-c C-n
(command next): execute the program one step forward, skipping over function calls.
Middle mouse button
(command display): display the named value $n under the mouse cursor (supports incremental browsing of large data structures).
C-c C-p
(command print): print value of identifier at point.
C-c C-d
(command display): display value of identifier at point.
C-c C-r
(command run): execute the program forward to next breakpoint.
C-c C-v
(command reverse): execute the program backward to latest breakpoint.
C-c C-l
(command last): go back one step in the command history.
C-c C-t
(command backtrace): display backtrace of function calls.
C-c C-f
(command finish): run forward till the current function returns.
C-c <
(command up): select the stack frame just above the current frame (its caller).
C-c >
(command down): select the stack frame just below the current frame (the frame it called).

In all buffers in OCaml editing mode, the following debugger commands are also available:

C-x C-a C-b
(command break): set a breakpoint at event closest to point
C-x C-a C-p
(command print): print value of identifier at point
C-x C-a C-d
(command display): display value of identifier at point