API Based Attention Sensing

Through its API, the GazeSense software allows an external application to read the attention
sensing signal in real time.

Python and C++ APIs are currently provided, with usage examples in the installation folder:

C:\Program Files (x86)\Eyeware\GazeSense <version>\API\python (Windows version)
<path_of_extracted_GazeSense_folder>/API/python (Linux version)

Please refer to it for further usage details. This API allows the creation of several targets of various shapes, and allows them to be moved, as shown in the examples in the above-mentioned folder. The following menu is available through Setup→API.

The Port field indicates the network port over which the GazeSense app communicates with a client application. GazeSense always publishes the tracking information, in real time, to a client through that port. However, GazeSense only listens to the client application when the Attention Context Control is set to API. See also the GazeSenseExample.py script for an example of how to receive data from the GazeSense app and how to configure the 3D setup from the client application.
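Before starting a client, it can be useful to confirm that something is listening on the configured port. The following is a minimal sketch that uses only the Python standard library and does not call the GazeSense API; the host and port values are hypothetical and should be replaced by the machine running GazeSense and the number shown in the Port field.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical values: use the host running GazeSense and the number
    # shown in the Port field of the Setup→API menu.
    HOST, PORT = "127.0.0.1", 12000
    print("GazeSense port reachable:", port_is_open(HOST, PORT))
```

A successful check only shows that the port accepts TCP connections. It does not confirm that the Attention Context Control is set to API.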

Data structure of the API output

The GazeSenseClient class makes the output of the attention sensing session available as a
Python dictionary. This dictionary contains the following keys (fields):

  • InTracking – boolean that indicates whether a subject is currently being tracked
  • ConnectionOK – boolean that indicates whether the client succeeded in establishing a connection to the GazeSense application
  • head_pose – rotation matrix and translation vector of the estimated face model, with respect to the reference point
  • nose_tip – 3D location of the subject’s nose tip
  • GazeCoding – string with the label of the gazed object
  • Head Attention Points, Head Attention Scores – head based intersection points and measured scores for each target
  • Gaze Attention Points, Gaze Attention Scores – gaze based intersection points and measured scores for each target
  • point_of_regard_3D – estimated 3D point where the subject is paying attention, calculated from a consensus between the Gaze and the Head Attention data
  • new_tracking_data – boolean that announces new data to track
  • timestamp – seconds passed since an arbitrary reference
  • screen_gaze_coordinates – 2D space conversion of the point of regard, for screen-like targets, to be understood as the screen pixel being gazed at
  • rgb_video_frame_buffer – raw buffer containing the row-wise RGB ordered pixels of the next frame of the color video, available after the function request_next_video_frame() has been called
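As an illustration of how a client might consume this dictionary, here is a minimal sketch that works on a hand-written sample rather than on live data. The key names come from the list above; the value types, the sample values, and the helper names summarize_frame and pixel_at are assumptions. The pixel helper additionally assumes 3 bytes per pixel (8 bits per channel) and no row padding, which the list above does not state.

```python
def summarize_frame(data: dict) -> str:
    """Build a one-line summary from a GazeSense-style output dictionary."""
    if not data.get("ConnectionOK", False):
        return "no connection to GazeSense"
    if not data.get("InTracking", False):
        return "connected, but no subject is being tracked"
    label = data.get("GazeCoding", "")
    t = data.get("timestamp", 0.0)
    return f"t={t:.2f}s gazing at '{label}'"


def pixel_at(buffer: bytes, width: int, x: int, y: int) -> tuple:
    """Return the (R, G, B) triple at column x, row y of a row-wise RGB buffer.

    Assumes 3 bytes per pixel and no padding at the end of each row.
    """
    start = 3 * (y * width + x)
    return tuple(buffer[start:start + 3])


# Hand-written sample; the values and their exact types are assumptions.
sample = {
    "ConnectionOK": True,
    "InTracking": True,
    "GazeCoding": "Screen",
    "timestamp": 12.5,
    "screen_gaze_coordinates": (960, 540),
}
print(summarize_frame(sample))
```

Checking ConnectionOK and InTracking before reading the other fields avoids interpreting stale or empty values when no subject is in view.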

Adding new targets
The GazeSense Python API supports the addition of several 3D shaped targets, here described by the methods used for their creation:

  • PointGazeTarget – for a simple 3D point in space
  • PlanarGazeTarget – for a rectangular 3D plane section, whose width and height are specified in meters
  • ScreenGazeTarget – for another rectangular 3D plane section, specified by its width and height in pixels and the screen diagonal in inches
  • CylinderGazeTarget – a cylinder shaped target, whose radius and height can be specified
All the above targets can be labeled with a string, and their rotation and translation arrays specified. Please note that positional values must be defined in meters, angular values in radians.
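Because units are easy to mix up here, the following sketch shows the arithmetic for relating a screen's pixel resolution and diagonal in inches to its physical size in meters, and for converting an angle from degrees to radians. It is plain geometry assuming square pixels, not a description of how GazeSense computes these values internally; the function name screen_size_m is hypothetical.

```python
import math

INCH_TO_M = 0.0254  # exact definition of the inch in meters


def screen_size_m(width_px: int, height_px: int, diagonal_in: float) -> tuple:
    """Physical (width, height) in meters of a screen with square pixels."""
    diagonal_px = math.hypot(width_px, height_px)
    m_per_px = diagonal_in * INCH_TO_M / diagonal_px
    return width_px * m_per_px, height_px * m_per_px


# Example: a 24 inch, 1920 x 1080 screen.
w, h = screen_size_m(1920, 1080, 24.0)
print(f"{w:.3f} m x {h:.3f} m")

# Angular values are expected in radians, so convert from degrees first.
tilt_rad = math.radians(15.0)
print(f"15 degrees = {tilt_rad:.4f} rad")
```

The same conversion is useful when positioning a PlanarGazeTarget so that it matches a physical monitor measured in inches.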

Tip: See the GazeSenseExample.py script, which adds point-like targets and simulates their movement by updating their 3D position over time with the time library.

Editing and removing targets
In the API Based attention sensing mode, the targets’ information and the camera settings are sent to GazeSense through TCP sockets, by calling the GazeSenseClient class methods send_gaze_targets_list() and send_camera_pose() in a continuous loop. This allows target attributes to be edited (e.g., simulating movement by altering a target’s rotation and translation, or resizing the target), and targets to be removed from the target list.
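The structure of such a loop can be sketched as follows. To keep the example runnable without GazeSense installed, it uses StubClient, a stand-in that only records what it receives and is not the real GazeSenseClient. The method names send_gaze_targets_list and send_camera_pose come from the text above, but their arguments, the dictionary representation of targets and camera pose, and the helper circle_position are assumptions made for illustration; see GazeSenseExample.py for the actual usage.

```python
import math
import time


class StubClient:
    """Stand-in for illustration only; NOT the real GazeSenseClient.

    It just records what it is given. The real methods' arguments may differ.
    """

    def __init__(self):
        self.sent_targets = []
        self.sent_camera_poses = []

    def send_gaze_targets_list(self, targets):
        self.sent_targets.append([dict(t) for t in targets])

    def send_camera_pose(self, pose):
        self.sent_camera_poses.append(pose)


def circle_position(t: float, radius_m: float = 0.2, period_s: float = 4.0):
    """Translation [x, y, z] in meters of a point circling in the x-y plane."""
    angle = 2.0 * math.pi * t / period_s  # radians
    return [radius_m * math.cos(angle), radius_m * math.sin(angle), 0.0]


def run(client, steps: int = 5, dt: float = 0.01):
    # Hypothetical target representation: plain dicts with label and translation.
    targets = [
        {"label": "moving_point", "translation": [0.0, 0.0, 0.0]},
        {"label": "temporary", "translation": [0.1, 0.0, 0.5]},
    ]
    camera_pose = {"rotation": [0.0, 0.0, 0.0], "translation": [0.0, 0.0, 0.0]}
    start = time.monotonic()
    for step in range(steps):
        elapsed = time.monotonic() - start
        # Editing: move the first target along a circle as time passes.
        targets[0]["translation"] = circle_position(elapsed)
        # Removal: drop a target from the list before the third send.
        if step == 2:
            targets = [t for t in targets if t["label"] != "temporary"]
        client.send_gaze_targets_list(targets)
        client.send_camera_pose(camera_pose)
        time.sleep(dt)


client = StubClient()
run(client)
print(len(client.sent_targets[0]), "->", len(client.sent_targets[-1]), "targets")
```

Because the full list is resent on every iteration, editing and removal both reduce to changing the list before the next send.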