DRAFT DRAFT DRAFT	WORK IN PROGRESS	DRAFT DRAFT DRAFT

This is work in progress and likely to change.

	Roland McGrath <roland@redhat.com>

		User Debugging Data & Event Rendezvous
		---- --------- ---- - ----- ----------

See linux/utrace.h for all the declarations used here.
See also linux/tracehook.h for the utrace_regset declarations.

UTRACE is infrastructure code for tracing and controlling user
threads. It is the foundation for writing tracing engines, which
can be loadable kernel modules. The UTRACE interfaces provide three
facilities:

* Thread event reporting

  Tracing engines can request callbacks for events of interest in
  the thread: signals, system calls, exit, exec, clone, etc.

* Thread control

  Tracing engines can prevent a thread from running (keeping it in
  TASK_TRACED state), or make it single-step or block-step (when
  hardware supports it). Engines can cause a thread to abort system
  calls, change the behavior of signals, and inject signal-style
  actions at will.

* Thread machine state access

  Tracing engines can read and write a thread's registers and
  similar per-thread CPU state.

The basic actors in UTRACE are the thread and the tracing engine.
A tracing engine is some body of code that calls into the utrace_*
interfaces, represented by a struct utrace_engine_ops. (Usually it's a
kernel module, though the legacy ptrace support is a tracing engine
that is not in a kernel module.) The UTRACE interface operates on
individual threads (struct task_struct). If an engine wants to
treat several threads as a group, that is up to its higher-level
code. Using UTRACE starts with attaching an engine to a thread.

	struct utrace_attached_engine *
	utrace_attach(struct task_struct *target, int flags,
		      const struct utrace_engine_ops *ops, unsigned long data);

Calling utrace_attach is what sets up a tracing engine to trace a
thread. Use UTRACE_ATTACH_CREATE in flags, and pass your engine's ops.
Check the return value with IS_ERR. If successful, it returns a
struct pointer that is the handle used in all other utrace_* calls.
The data argument is stored in the utrace_attached_engine structure,
for your code to use however it wants.
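
The attach-and-check pattern described above can be sketched in userspace
as follows. This is only an illustration under stated assumptions: the
ERR_PTR/IS_ERR/PTR_ERR helpers are simplified stand-ins for the kernel's
linux/err.h, utrace_attach is a mock with the prototype quoted above, the
value of UTRACE_ATTACH_CREATE and the -ENOENT failure are made up, and
my_attach is a hypothetical helper, not part of the interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Minimal stand-ins for the kernel types named in the text. */
struct task_struct { int pid; };
struct utrace_engine_ops { int unused; };
struct utrace_attached_engine {
	const struct utrace_engine_ops *ops;
	unsigned long data;
};

#define UTRACE_ATTACH_CREATE 0x0010	/* assumed value, illustration only */

/* Mock: succeeds only when UTRACE_ATTACH_CREATE is requested. */
static struct utrace_attached_engine *
utrace_attach(struct task_struct *target, int flags,
	      const struct utrace_engine_ops *ops, unsigned long data)
{
	static struct utrace_attached_engine engine;

	(void)target;
	if (!(flags & UTRACE_ATTACH_CREATE))
		return ERR_PTR(-ENOENT);	/* error choice is an assumption */
	engine.ops = ops;
	engine.data = data;
	return &engine;
}

/* Hypothetical helper: returns 0 and the handle, or a negative errno. */
static int my_attach(struct task_struct *target, int flags,
		     const struct utrace_engine_ops *ops, unsigned long data,
		     struct utrace_attached_engine **out)
{
	struct utrace_attached_engine *engine;

	engine = utrace_attach(target, flags, ops, data);
	if (IS_ERR(engine))
		return (int)PTR_ERR(engine);
	*out = engine;
	return 0;
}
```

The point of the sketch is that the return value is never compared against
NULL: a failed attach is an encoded errno, so IS_ERR is the only valid check.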

	void utrace_detach(struct task_struct *target,
			   struct utrace_attached_engine *engine);

The utrace_detach call removes an engine from a thread.
No more callbacks will be made after this returns.

An attached engine does nothing by default.
An engine makes something happen by setting its flags.

	void utrace_set_flags(struct task_struct *target,
			      struct utrace_attached_engine *engine,

There are two kinds of flags that an attached engine can set: event
flags, and action flags. Event flags register interest in particular
events; when an event happens and an engine has the right event flag
set, it gets a callback. Action flags change the normal behavior of
the thread. The action flags available are:

	UTRACE_ACTION_QUIESCE

		The thread will stay quiescent (see below).
		As long as any engine asserts the QUIESCE action flag,
		the thread will not resume running in user mode.
		(Usually it will be in TASK_TRACED state.)
		Nothing will wake the thread up except for SIGKILL
		(and implicit SIGKILLs such as a core dump in
		another thread sharing the same address space, or a
		group exit or fatal signal in another thread in the
		same thread group).

	UTRACE_ACTION_SINGLESTEP

		When the thread runs, it will run one instruction
		and then trap. (Exiting a system call or entering a
		signal handler is considered "an instruction" for this.)
		This can be used only if ARCH_HAS_SINGLE_STEP is #define'd
		by <asm/tracehook.h> and evaluates to nonzero.

	UTRACE_ACTION_BLOCKSTEP

		When the thread runs, it will run until the next branch,
		and then trap. (Exiting a system call or entering a
		signal handler is considered a branch for this.)
		When the SINGLESTEP flag is set, BLOCKSTEP has no effect.
		This is only available on some machines (actually none yet).
		This can be used only if ARCH_HAS_BLOCK_STEP is #define'd
		by <asm/tracehook.h> and evaluates to nonzero.

	UTRACE_ACTION_NOREAP

		When the thread exits or stops for job control, its
		parent process will not receive a SIGCHLD and the
		parent's wait calls will not wake up or report the
		child as dead. A well-behaved tracing engine does not
		want to interfere with the parent's normal notifications.
		This is provided mainly for the ptrace compatibility
		code to implement the traditional behavior.

Event flags are specified using the macro UTRACE_EVENT(TYPE).
Each event type is associated with a report_* callback in struct
utrace_engine_ops. A tracing engine can leave unused callbacks NULL.
The only callbacks required are those used by the event flags it sets.
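
As a rough illustration of how event flags and action flags combine into
one flags word, consider the sketch below. Everything numeric in it is an
assumption made for the example: the bit positions, UTRACE_FIRST_EVENT, the
UTRACE_EVNUM_* names, the expansion of UTRACE_EVENT, and the use of
unsigned long for the flags word. The real definitions are in
linux/utrace.h. The helper my_engine_flags is hypothetical.

```c
#include <assert.h>

/* Assumed bit layout, for illustration only (see linux/utrace.h). */
#define UTRACE_ACTION_QUIESCE		0x0001UL
#define UTRACE_ACTION_SINGLESTEP	0x0002UL
#define UTRACE_ACTION_BLOCKSTEP		0x0004UL
#define UTRACE_ACTION_NOREAP		0x0008UL

/* Hypothetical event numbering; event bits sit above the action bits. */
#define UTRACE_FIRST_EVENT 8
enum utrace_event_number {
	UTRACE_EVNUM_QUIESCE,
	UTRACE_EVNUM_REAP,
	UTRACE_EVNUM_CLONE,
	UTRACE_EVNUM_EXEC,
	UTRACE_EVNUM_EXIT,
	UTRACE_EVNUM_DEATH
};
#define UTRACE_EVENT(type) \
	(1UL << (UTRACE_FIRST_EVENT + UTRACE_EVNUM_##type))

/*
 * Build the flags word for an engine that watches exec, exit and reap.
 * When stop_thread is nonzero it also asserts the QUIESCE action and asks
 * for the QUIESCE event, so it learns when the thread has stopped.
 */
static unsigned long my_engine_flags(int stop_thread)
{
	unsigned long flags = UTRACE_EVENT(EXEC) | UTRACE_EVENT(EXIT) |
			      UTRACE_EVENT(REAP);

	if (stop_thread)
		flags |= UTRACE_ACTION_QUIESCE | UTRACE_EVENT(QUIESCE);
	return flags;
}
```

An engine that builds its flags this way needs report_exec, report_exit and
report_reap callbacks in its ops (plus report_quiesce when stop_thread is
used); every other callback can stay NULL.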

Many engines can be attached to each thread. When a thread has an
event, each engine gets a report_* callback if it has set the event flag
for that event type. Engines are called in the order they attached.

Each callback takes arguments giving the details of the particular
event. The first two arguments to every callback are the struct
utrace_attached_engine and struct task_struct pointers for the engine
and the thread producing the event. Usually this will be the current
thread that is running the callback functions.

The return value of report_* callbacks is a bitmask. Some bits are
common to all callbacks, and some are particular to that callback and
event type. The value zero (UTRACE_ACTION_RESUME) always means the
simplest thing: do what would have happened with no tracing engine here.
These are the flags that can be set in any report_* return value:

	UTRACE_ACTION_NEWSTATE

		Update the action state flags, described above. Those
		bits from the return value (UTRACE_ACTION_STATE_MASK)
		replace those bits in the engine's flags. This has the
		same effect as calling utrace_set_flags, but is a more
		efficient short-cut. To change the event flags, you must
		call utrace_set_flags.

		Detach this engine. This has the effect of calling
		utrace_detach, but is a more efficient short-cut.

		Hide this event from other tracing engines. This is
		only appropriate to do when the event was induced by
		some action of this engine, such as a breakpoint trap.
		Some events cannot be hidden, since every engine has to
		know about them: exit, death, reap.

The return value bits in UTRACE_ACTION_OP_MASK indicate a change to the
normal behavior of the event taking place. If zero, the thread does
whatever that event normally means. For report_signal, other values
control the disposition of the signal.
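
The effect of UTRACE_ACTION_NEWSTATE in a callback's return value can be
modeled in a few lines. This is a sketch of the rule stated above, not the
kernel's implementation: the numeric values of the flags and of
UTRACE_ACTION_STATE_MASK are assumptions, and apply_callback_result and
my_report_stop are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

/* Assumed values, for illustration only (see linux/utrace.h). */
#define UTRACE_ACTION_RESUME		0x0000u	/* zero, as the text states */
#define UTRACE_ACTION_QUIESCE		0x0001u
#define UTRACE_ACTION_SINGLESTEP	0x0002u
#define UTRACE_ACTION_BLOCKSTEP		0x0004u
#define UTRACE_ACTION_NOREAP		0x0008u
#define UTRACE_ACTION_STATE_MASK	0x000fu
#define UTRACE_ACTION_NEWSTATE		0x0040u

/*
 * Model of the documented rule: when NEWSTATE is set, the state bits of
 * the return value replace the state bits in the engine's flags.  Event
 * flags are never touched by a callback return value.
 */
static unsigned long apply_callback_result(unsigned long engine_flags, u32 ret)
{
	if (ret & UTRACE_ACTION_NEWSTATE) {
		engine_flags &= ~(unsigned long)UTRACE_ACTION_STATE_MASK;
		engine_flags |= ret & UTRACE_ACTION_STATE_MASK;
	}
	return engine_flags;
}

/* Hypothetical callback result: hold the thread quiescent from now on. */
static u32 my_report_stop(void)
{
	return UTRACE_ACTION_NEWSTATE | UTRACE_ACTION_QUIESCE;
}
```

Note that returning UTRACE_ACTION_NEWSTATE with no state bits clears every
action state flag the engine had set, which is how a callback lets a
previously stopped thread resume.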

Before an engine can control another thread and access its state, that
thread must be "quiescent". This means that it is stopped and won't start
running again while we access it. A quiescent thread is stopped in a place
close to user mode, where the user state can be accessed safely; either
it's about to return to user mode, or it's just entered the kernel from
user mode, or it has already finished exiting (TASK_ZOMBIE). Setting the
UTRACE_ACTION_QUIESCE action flag will force the attached thread to become
quiescent soon. After setting the flag, an engine must wait for an event
callback when the thread becomes quiescent. The thread may be running on
another CPU, or may be in an uninterruptible wait. When it is ready to be
examined, it will make callbacks to engines that set the
UTRACE_EVENT(QUIESCE) event flag.

As long as some engine has UTRACE_ACTION_QUIESCE set, the thread will
remain stopped. SIGKILL will wake it up, but it will not run user code.
When the flag is cleared via utrace_set_flags or a callback return value,
the thread starts running again.
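
The "as long as some engine" rule amounts to OR-ing the flags of all
attached engines. The sketch below models only that rule; struct
engine_model, thread_must_stay_stopped and the value of
UTRACE_ACTION_QUIESCE are assumptions for illustration and say nothing
about how the kernel stores engines.

```c
#include <assert.h>
#include <stddef.h>

#define UTRACE_ACTION_QUIESCE 0x0001UL	/* assumed value, illustration only */

/* Hypothetical model of one attached engine: just its flags word. */
struct engine_model {
	unsigned long flags;
};

/*
 * The thread may resume only when no attached engine asserts QUIESCE.
 * Returns 1 while the thread has to remain stopped, 0 otherwise.
 */
static int thread_must_stay_stopped(const struct engine_model *engines,
				    size_t n)
{
	unsigned long combined = 0;
	size_t i;

	for (i = 0; i < n; i++)
		combined |= engines[i].flags;
	return (combined & UTRACE_ACTION_QUIESCE) != 0;
}
```

One consequence is that an engine clearing its own QUIESCE flag cannot
assume the thread is running afterwards; another engine may still hold it.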

During the event callbacks (report_*), the thread in question makes the
callback from a safe place. It is not quiescent, but it can safely access
its own state. Callbacks can access thread state directly without setting
the QUIESCE action flag. If a callback does want to prevent the thread
from resuming normal execution, it *must* use the QUIESCE action state
rather than simply blocking; see "Core Events & Callbacks", below.

These calls must be made on a quiescent thread (or the current thread):

	int utrace_inject_signal(struct task_struct *target,
				 struct utrace_attached_engine *engine,
				 u32 action, siginfo_t *info,
				 const struct k_sigaction *ka);

Cause a specified signal delivery in the target thread. This is not
like kill, which generates a signal to be dequeued and delivered later.
Injection directs the thread to deliver a signal now, before it next
resumes in user mode or dequeues any other pending signal. It's as if
the tracing engine intercepted a signal event and its report_signal
callback returned the action argument as its value (see below). The
info and ka arguments serve the same purposes as their counterparts in
a report_signal callback.

	const struct utrace_regset *
	utrace_regset(struct task_struct *target,
		      struct utrace_attached_engine *engine,
		      const struct utrace_regset_view *view,

Get access to machine state for the thread. The struct utrace_regset_view
indicates a view of machine state, corresponding to a user mode
architecture personality (such as 32-bit or 64-bit versions of a machine).
The which argument selects one of the register sets available in that view.
The utrace_regset call must be made before accessing any machine state,
each time the thread has been running and has then become quiescent.
It ensures that the thread's state is ready to be accessed, and returns
the struct utrace_regset giving its accessor functions.

XXX needs front ends for argument checks, export utrace_native_view

	Core Events & Callbacks
	---- ------ - ---------

Event reporting callbacks have details particular to the event type, but
are all called in similar environments and have the same constraints.
Callbacks are made from safe spots, where no locks are held, no special
resources are pinned, and the user-mode state of the thread is accessible.
So, callback code has a pretty free hand. But to be a good citizen,
callback code should never block for long periods. It is fine to block in
kmalloc and the like, but never wait for i/o or for user mode to do
something. If you need the thread to wait, set UTRACE_ACTION_QUIESCE and
return from the callback quickly. When your i/o finishes or whatever, you
can use utrace_set_flags to resume the thread.

Well-behaved callbacks are important to maintain two essential properties
of the interface. The first of these is that unrelated tracing engines
should not interfere with each other. If your engine's event callback does
not return quickly, then another engine won't get the event notification in
a timely manner. The second important property is that tracing be as
noninvasive as possible to the normal operation of the system overall and
of the traced thread in particular. That is, attached tracing engines
should not perturb a thread's behavior, except to the extent that changing
its user-visible state is explicitly what you want to do. (Obviously some
perturbation is unavoidable, primarily timing changes, ranging from small
delays due to the overhead of tracing, to arbitrary pauses in user code
execution when a user stops a thread with a debugger for examination. When
doing asynchronous utrace_attach to a thread doing a system call, more
troublesome side effects are possible.)

Even when you explicitly want the perturbation of making the traced thread
block, just blocking directly in your callback has more unwanted effects.
For example, the CLONE event callbacks are called when the new child thread
has been created but not yet started running; the child can never be
scheduled until the CLONE tracing callbacks return. (This allows engines
tracing the parent to attach to the child.) If a CLONE event callback
blocks the parent thread, it also prevents the child thread from running
(even to process a SIGKILL). If what you want is to make both the parent
and child block, then use utrace_attach on the child and then set the
QUIESCE action state flag on both threads.

A more crucial problem with blocking in callbacks is that it can prevent
SIGKILL from working. A thread that is blocking due to
UTRACE_ACTION_QUIESCE will still wake up and die immediately when sent a
SIGKILL, as all threads should. Relying on the utrace infrastructure rather
than on private synchronization calls in event callbacks is an important
way to help keep tracing robustly noninvasive.

UTRACE_EVENT(REAP)		Dead thread has been reaped

	void (*report_reap)(struct utrace_attached_engine *engine,
			    struct task_struct *tsk);

This means the parent called wait, or else this was a detached thread or
a process whose parent ignores SIGCHLD. This cannot happen while the
UTRACE_ACTION_NOREAP flag is set. This is the only callback you are
guaranteed to get (if you set its event flag).

Unlike other callbacks, this can be called from the parent's context
rather than from the traced thread itself, so it must not delay the parent
by blocking. Unlike all other callbacks, this one returns void.
Once you get this callback, your engine is automatically detached and you
cannot access this thread or use this struct utrace_attached_engine handle
any longer. This is the place to clean up your data structures and
synchronize with your code that might try to make utrace_* calls using this
engine data structure. The struct is still valid during this callback,
but will be freed soon after it returns (via RCU).

In all other callbacks, the return value is as described above.
The common UTRACE_ACTION_* flags in the return value are always observed.
Unless otherwise specified below, other bits in the return value are ignored.

UTRACE_EVENT(QUIESCE)		Thread is quiescent

	u32 (*report_quiesce)(struct utrace_attached_engine *engine,
			      struct task_struct *tsk);

This is the least interesting callback. It happens at any safe spot,
including after any other event callback. This lets the tracing engine
know that it is safe to access the thread's state, or to report to users
that it has stopped running user code.

UTRACE_EVENT(CLONE)		Thread is creating a child

	u32 (*report_clone)(struct utrace_attached_engine *engine,
			    struct task_struct *parent,
			    unsigned long clone_flags,
			    struct task_struct *child);

A clone/clone2/fork/vfork system call has succeeded in creating a new
thread or child process. The new process is fully formed, but not yet
running. During this callback, other tracing engines are prevented from
using utrace_attach asynchronously on the child, so that engines tracing
the parent get the first opportunity to attach. After this callback
returns, the child will start and the parent's system call will return.
If CLONE_VFORK is set, the parent will block before returning.

UTRACE_EVENT(VFORK_DONE)	Finished waiting for CLONE_VFORK child

	u32 (*report_vfork_done)(struct utrace_attached_engine *engine,
				 struct task_struct *parent, pid_t child_pid);

Event reported for parent using CLONE_VFORK or vfork system call.
The child has died or exec'd, so the vfork parent has unblocked
and is about to return child_pid.

UTRACE_EVENT(EXEC)		Completed exec

	u32 (*report_exec)(struct utrace_attached_engine *engine,
			   struct task_struct *tsk,
			   const struct linux_binprm *bprm,
			   struct pt_regs *regs);

An execve system call has succeeded and the new program is about to
start running. The initial user register state can be tweaked directly
through regs, or utrace_regset can be used for full machine state access.

UTRACE_EVENT(EXIT)		Thread is exiting

	u32 (*report_exit)(struct utrace_attached_engine *engine,
			   struct task_struct *tsk,
			   long orig_code, long *code);

The thread is exiting and cannot be prevented from doing so, but all its
state is still live. The *code value will be the wait result seen by
the parent, and can be changed by this engine or others. The orig_code
value is the real status, not changed by any tracing engine.
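
A report_exit callback that records the real status and optionally
rewrites the reported one might look like the sketch below. It is a
userspace illustration only: the struct definitions are minimal stand-ins,
the data field name follows the description of utrace_attach but is an
assumption, and struct exit_record with its force_success switch is
invented for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

/* Minimal stand-ins for the kernel types; only the fields used here. */
struct task_struct { int pid; };
struct utrace_attached_engine { unsigned long data; };

#define UTRACE_ACTION_RESUME 0x0000u	/* zero, as the text states */

/* Hypothetical per-engine record, reached through engine->data. */
struct exit_record {
	long real_status;	/* orig_code: never altered by engines */
	long reported_status;	/* *code as this engine leaves it */
	int force_success;	/* policy switch invented for this sketch */
};

static u32 my_report_exit(struct utrace_attached_engine *engine,
			  struct task_struct *tsk,
			  long orig_code, long *code)
{
	struct exit_record *rec = (struct exit_record *)engine->data;

	(void)tsk;
	rec->real_status = orig_code;
	if (rec->force_success)
		*code = 0;	/* changes what the parent's wait will see,
				   unless a later engine changes it again */
	rec->reported_status = *code;
	return UTRACE_ACTION_RESUME;
}
```

Comparing orig_code with *code on entry is also how an engine can tell
whether an engine called before it has already rewritten the status.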

UTRACE_EVENT(DEATH)		Thread has finished exiting

	u32 (*report_death)(struct utrace_attached_engine *engine,
			    struct task_struct *tsk);

The thread is really dead now. If the UTRACE_ACTION_NOREAP flag is set
after this callback, it remains an unreported zombie. Otherwise, it might
be reaped by its parent, or self-reap immediately. Though the actual
reaping may happen in parallel, a report_reap callback will always be
ordered after a report_death callback.

UTRACE_EVENT(SYSCALL_ENTRY)	Thread has entered kernel for a system call

	u32 (*report_syscall_entry)(struct utrace_attached_engine *engine,
				    struct task_struct *tsk,
				    struct pt_regs *regs);

The system call number and arguments can be seen and modified in the
registers. The return value register has -ENOSYS, which will be
returned for an invalid system call. The macro tracehook_abort_syscall(regs)
will abort the system call so that we go immediately to syscall exit,
and return -ENOSYS (or whatever the register state is changed to). If
tracing engines keep the thread quiescent here, the system call will
not be performed until it resumes.

UTRACE_EVENT(SYSCALL_EXIT)	Thread is leaving kernel after a system call

	u32 (*report_syscall_exit)(struct utrace_attached_engine *engine,
				   struct task_struct *tsk,
				   struct pt_regs *regs);

The return value can be seen and modified in the registers. If the
thread is allowed to resume, it will see any pending signals and then
return to user mode.

UTRACE_EVENT(SIGNAL)		Signal caught by user handler
UTRACE_EVENT(SIGNAL_IGN)	Signal with no effect (SIG_IGN or default)
UTRACE_EVENT(SIGNAL_STOP)	Job control stop signal
UTRACE_EVENT(SIGNAL_TERM)	Fatal termination signal
UTRACE_EVENT(SIGNAL_CORE)	Fatal core-dump signal
UTRACE_EVENT_SIGNAL_ALL		All of the above (bitmask)

	u32 (*report_signal)(struct utrace_attached_engine *engine,
			     struct task_struct *tsk,
			     u32 action, siginfo_t *info,
			     const struct k_sigaction *orig_ka,
			     struct k_sigaction *return_ka);

There are five types of signal events, but all use the same callback.
These happen when a thread is dequeuing a signal to be delivered.
(Not immediately when the signal is sent, and not when the signal is
blocked.) No signal event is reported for SIGKILL; no tracing engine
can prevent it from killing the thread immediately. The specific
event types allow an engine to trace signals based on what they do.
UTRACE_EVENT_SIGNAL_ALL is all of them OR'd together, to trace all
signals (except SIGKILL). A subset of these event flags can be used
e.g. to catch only fatal signals, not handled ones, or to catch only
core-dump signals, not normal termination signals.

The action argument says what the signal's default disposition is:

	UTRACE_SIGNAL_DELIVER	Run the user handler from sigaction.
	UTRACE_SIGNAL_IGN	Do nothing, ignore the signal.
	UTRACE_SIGNAL_TERM	Terminate the process.
	UTRACE_SIGNAL_CORE	Terminate the process and write a core dump.
	UTRACE_SIGNAL_STOP	Absolutely stop the process, a la SIGSTOP.
	UTRACE_SIGNAL_TSTP	Job control stop (no stop if orphaned).

This selection is made by consulting the process's sigaction and the
default action for the signal number, but may already have been
changed by an earlier tracing engine (in which case you see its override).
A return value of UTRACE_ACTION_RESUME means to carry out this action.
If instead UTRACE_SIGNAL_* bits are in the return value, that overrides
the normal behavior of the signal.

The signal number and other details of the signal are in info, and
this data can be changed to make the thread see a different signal.
A return value of UTRACE_SIGNAL_DELIVER says to follow the sigaction in
return_ka, which can specify a user handler or SIG_IGN to ignore the
signal or SIG_DFL to follow the default action for info->si_signo.
The orig_ka parameter shows the process's sigaction at the time the
signal was dequeued, and return_ka initially contains this. Tracing
engines can modify return_ka to change the effects of delivery.
For other UTRACE_SIGNAL_* return values, return_ka is ignored.
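
The override rule for report_signal can be illustrated with a callback
that turns one core-dumping signal into a plain termination and leaves
everything else alone. This is a userspace sketch under assumptions: the
UTRACE_SIGNAL_* values are invented, struct k_sigaction is an opaque
stand-in, the action argument is assumed to carry exactly one
UTRACE_SIGNAL_* value, and the choice of SIGQUIT is arbitrary.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for siginfo_t in <signal.h> */
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>

typedef uint32_t u32;

/* Minimal stand-ins for the kernel types; contents are placeholders. */
struct task_struct { int pid; };
struct utrace_attached_engine { unsigned long data; };
struct k_sigaction { int placeholder; };

/* Assumed values, for illustration only (see linux/utrace.h). */
#define UTRACE_ACTION_RESUME	0x0000u	/* zero, as the text states */
#define UTRACE_SIGNAL_DELIVER	0x0100u
#define UTRACE_SIGNAL_IGN	0x0200u
#define UTRACE_SIGNAL_TERM	0x0300u
#define UTRACE_SIGNAL_CORE	0x0400u

static u32 my_report_signal(struct utrace_attached_engine *engine,
			    struct task_struct *tsk,
			    u32 action, siginfo_t *info,
			    const struct k_sigaction *orig_ka,
			    struct k_sigaction *return_ka)
{
	(void)engine;
	(void)tsk;
	(void)orig_ka;
	(void)return_ka;	/* ignored unless DELIVER is returned */

	/* Override: let SIGQUIT kill the process without a core dump. */
	if (action == UTRACE_SIGNAL_CORE && info->si_signo == SIGQUIT)
		return UTRACE_SIGNAL_TERM;

	/* Anything else: carry out the disposition passed in as action. */
	return UTRACE_ACTION_RESUME;
}
```

Because action may already reflect an earlier engine's override, the
callback tests action rather than recomputing the default from the signal
number.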

UTRACE_SIGNAL_HOLD is a flag bit that can be OR'd into the return
value. It says to push the signal back on the thread's queue, with
the signal number and details possibly changed in info. When the
thread is allowed to resume, it will dequeue and report it again.