LIBPFM - Online Linux Manual Page
Section : 3
Updated : January, 2009
Note : Linux Programmer's Manual
NAME
libpfm_nehalem - support for Intel Nehalem processor family
SYNOPSIS
#include <perfmon/pfmlib.h>
#include <perfmon/pfmlib_intel_nhm.h>
DESCRIPTION
The libpfm library provides full support for the Intel Nehalem processor family, such as Intel Core i7. The interface is defined in pfmlib_intel_nhm.h. It consists of a set of functions and structures describing the PMU features specific to Intel Nehalem processors.

The Intel Nehalem processor is a quad-core, dual-thread processor. It includes two types of PMU: core and uncore. The latter measures events at the socket level and is therefore not tied to any of the four cores. The core PMU implements Intel architectural perfmon version 3 with four generic counters and three fixed counters. The uncore PMU has eight generic counters and one fixed counter. Each Intel Nehalem core also implements a 16-deep branch trace buffer, called Last Branch Record (LBR), which can be used in combination with the core PMU. Intel Nehalem implements a newer version of the Precise Event-Based Sampling (PEBS) mechanism, which can capture where cache misses occur.

When Intel Nehalem processor-specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the pfm_dispatch_events() function. These input arguments are described by the pfmlib_nhm_input_param_t structure. No output parameters are currently defined. The input parameters are defined as follows:

typedef struct {
unsigned long cnt_mask;
unsigned int flags;
} pfmlib_nhm_counter_t;
typedef struct {
unsigned int lbr_used;
unsigned int lbr_plm;
unsigned int lbr_filter;
} pfmlib_nhm_lbr_t;
typedef struct {
unsigned int pebs_used;
unsigned int ld_lat_thres;
} pfmlib_nhm_pebs_t;
typedef struct {
pfmlib_nhm_counter_t pfp_nhm_counters[PMU_NHM_NUM_COUNTERS];
pfmlib_nhm_pebs_t pfp_nhm_pebs;
pfmlib_nhm_lbr_t pfm_nhm_lbr;
uint64_t reserved[4];
} pfmlib_nhm_input_param_t;

The Intel Nehalem processor provides a few additional per-event features for counters: thresholding, inversion, edge detection, monitoring of both threads, and queue occupancy reset. They can be set for each event using the pfp_nhm_counters array. The flags field can be initialized with the following values, depending on the event:

PFMLIB_NHM_SEL_INV
    Invert the result of the cnt_mask comparison when set. This flag is supported for core and uncore PMU events.

PFMLIB_NHM_SEL_EDGE
    Enable edge detection of events. This flag is supported for core and uncore PMU events.

PFMLIB_NHM_SEL_ANYTHR
    Enable measuring the event in either of the two processor threads, assuming hyper-threading is enabled. By default, only the current thread is measured. This flag is restricted to core PMU events.

PFMLIB_NHM_SEL_OCC_RST
    When set, the queue occupancy counter associated with the event is cleared. This flag is only available to uncore PMU events.

The cnt_mask field is used to set the event threshold. The counter is incremented for each cycle in which the number of occurrences of the event is greater than or equal to the value of the field. Thus, the event is modified to actually measure the number of qualifying cycles. When the field is zero, all occurrences are counted (this is the default). This field is supported for core and uncore PMU events.
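
The effect of cnt_mask and of the inversion flag can be illustrated with a small software model. The following is only a sketch of the counting rule described above, not library code: sketch_count() and SKETCH_SEL_INV are hypothetical names, the flag value is a placeholder for PFMLIB_NHM_SEL_INV, and ignoring inversion when cnt_mask is zero is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder standing in for PFMLIB_NHM_SEL_INV (hypothetical value;
 * the real constant comes from pfmlib_intel_nhm.h). */
#define SKETCH_SEL_INV 0x1u

/*
 * Software model of the thresholding rule (hypothetical helper).
 * per_cycle[i] is the number of event occurrences in cycle i.
 *  - cnt_mask == 0: every occurrence is counted (inversion is ignored
 *    here, which is an assumption of this model).
 *  - cnt_mask  > 0: one count per cycle in which occurrences >= cnt_mask;
 *    with SKETCH_SEL_INV the comparison result is inverted.
 */
uint64_t sketch_count(const unsigned int *per_cycle, size_t ncycles,
                      unsigned long cnt_mask, unsigned int flags)
{
    uint64_t total = 0;

    assert(per_cycle != NULL || ncycles == 0);

    for (size_t i = 0; i < ncycles; i++) {
        if (cnt_mask == 0) {
            total += per_cycle[i];
            continue;
        }
        int qualifies = per_cycle[i] >= cnt_mask;
        if (flags & SKETCH_SEL_INV)
            qualifies = !qualifies;
        if (qualifies)
            total++;
    }
    return total;
}
```

With a non-zero cnt_mask the result is a number of cycles rather than a number of events, which is why the same event can yield very different values depending on the threshold.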
Support for Precise Event-Based Sampling (PEBS)
The library can be used to set up the PMC registers associated with PEBS. In this case, the pfmlib_nhm_pebs_t structure must be used and the pebs_used field must be set to 1. To enable the PEBS load latency filtering capability, it is necessary to program the MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD event into one generic counter. The latency threshold must be passed to the library in the ld_lat_thres field. It is expressed in core cycles and must be greater than 3. Note that pebs_used must be set as well.
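
The constraints on the PEBS fields can be sketched as follows. This is a minimal illustration, not part of libpfm: the structure is a local copy of the definition shown in DESCRIPTION so that the fragment compiles on its own, and sketch_setup_ld_lat() is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* Local copy of the structure from DESCRIPTION, so the sketch builds
 * without the libpfm headers. */
typedef struct {
    unsigned int pebs_used;
    unsigned int ld_lat_thres;
} pfmlib_nhm_pebs_t;

/*
 * Hypothetical helper (not a libpfm function): fill in the PEBS settings
 * for load latency filtering. Returns 0 on success, -1 if the threshold
 * is not greater than 3 core cycles; on failure *pebs is left untouched.
 */
int sketch_setup_ld_lat(pfmlib_nhm_pebs_t *pebs, unsigned int thres_cycles)
{
    assert(pebs != NULL);

    if (thres_cycles <= 3)
        return -1;

    memset(pebs, 0, sizeof(*pebs));
    pebs->pebs_used = 1;              /* required whenever PEBS is used */
    pebs->ld_lat_thres = thres_cycles;
    return 0;
}
```

In a real program the filled-in structure would be the pfp_nhm_pebs member of the pfmlib_nhm_input_param_t passed to pfm_dispatch_events(), with MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD programmed into a generic counter.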
Support for Last Branch Record (LBR)
The library can be used to set up the LBR registers. On Intel Nehalem processors, the LBR is 16 entries deep and branches can be filtered based on privilege level or type. To configure the LBR, the pfmlib_nhm_lbr_t structure must be used. Like core PMU counters, the LBR only distinguishes two privilege levels: 0 and the rest (1, 2, 3). When running Linux natively, the kernel runs at privilege level 0 and applications at level 3. The privilege level of the LBR can be specified using the lbr_plm field. Any attempt to pass PFM_PLM1 or PFM_PLM2 will be rejected. If lbr_plm is 0, then the global value in the pfp_dfl_plm field of pfmlib_input_param_t is used.

By default, the LBR captures all branches. Branches can be filtered out by passing a set of flags in lbr_filter. The flags are as follows:

PFMLIB_NHM_LBR_JCC
    When set, LBR does not capture conditional branches. Default: off.

PFM_NHM_LBR_NEAR_REL_CALL
    When set, LBR does not capture near calls. Default: off.

PFM_NHM_LBR_NEAR_IND_CALL
    When set, LBR does not capture indirect calls. Default: off.

PFM_NHM_LBR_NEAR_RET
    When set, LBR does not capture return branches. Default: off.

PFM_NHM_LBR_NEAR_IND_JMP
    When set, LBR does not capture indirect branches. Default: off.

PFM_NHM_LBR_NEAR_REL_JMP
    When set, LBR does not capture relative branches. Default: off.

PFM_NHM_LBR_FAR_BRANCH
    When set, LBR does not capture far branches. Default: off.
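
The privilege level rules for the LBR can be modelled as shown below. This is a sketch of the behaviour described above, not the library's implementation: the SKETCH_PLM* constants are stand-ins for PFM_PLM0 to PFM_PLM3 with assumed bit values, and sketch_resolve_lbr_plm() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for PFM_PLM0..PFM_PLM3 (assumed bit values; the real
 * constants are defined in pfmlib.h). */
#define SKETCH_PLM0 0x1u
#define SKETCH_PLM1 0x2u
#define SKETCH_PLM2 0x4u
#define SKETCH_PLM3 0x8u

/*
 * Hypothetical helper modelling the lbr_plm rules:
 *  - a mask containing level 1 or level 2 is rejected (returns -1);
 *  - a mask of 0 falls back to the global default (pfp_dfl_plm);
 *  - otherwise the LBR-specific mask is used.
 * On success, returns 0 and stores the effective mask in *effective.
 */
int sketch_resolve_lbr_plm(unsigned int lbr_plm, unsigned int dfl_plm,
                           unsigned int *effective)
{
    assert(effective != NULL);

    if (lbr_plm & (SKETCH_PLM1 | SKETCH_PLM2))
        return -1;

    *effective = lbr_plm ? lbr_plm : dfl_plm;
    return 0;
}
```

The model only validates the LBR-specific mask; how the library treats the global default is not covered by this page and is left out on purpose.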
Support for uncore PMU
By nature, the uncore PMU does not distinguish privilege levels; therefore, it captures events at all privilege levels. To avoid any misinterpretation, the library enforces that uncore events be measured with both PFM_PLM0 and PFM_PLM3 set. Tools and operating system kernel interfaces may impose further restrictions on how the uncore PMU can be accessed.
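
A tool may want to check the uncore privilege level requirement before calling the library. The following is a minimal sketch under stated assumptions: the SKETCH_PLM* values stand in for PFM_PLM0 and PFM_PLM3, the input is an array of effective per-event privilege masks, and sketch_first_bad_uncore_plm() is a hypothetical helper, not a libpfm function.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for PFM_PLM0 and PFM_PLM3 (assumed bit values; the real
 * constants are defined in pfmlib.h). */
#define SKETCH_PLM0 0x1u
#define SKETCH_PLM3 0x8u

/*
 * Hypothetical pre-check: given the effective privilege mask of each
 * uncore event, return the index of the first event that does not have
 * both level 0 and level 3 set, or -1 if all events qualify.
 */
long sketch_first_bad_uncore_plm(const unsigned int *event_plm, size_t n)
{
    const unsigned int required = SKETCH_PLM0 | SKETCH_PLM3;

    assert(event_plm != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if ((event_plm[i] & required) != required)
            return (long)i;
    }
    return -1;
}
```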
SEE ALSO
pfm_dispatch_events(3) and the set of examples shipped with the library
AUTHOR
Stephane Eranian <eranian@gmail.com>