Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Rust implementation for Python client's update_run() method. #1314

Merged
merged 2 commits into from
Dec 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion python/langsmith/client.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
"""Client for interacting with the LangSmith API.

Check notice on line 1 in python/langsmith/client.py

View workflow job for this annotation

GitHub Actions / benchmark

Benchmark results

........... WARNING: the benchmark result may be unstable * the standard deviation (87.9 ms) is 13% of the mean (681 ms) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. create_5_000_run_trees: Mean +- std dev: 681 ms +- 88 ms ........... WARNING: the benchmark result may be unstable * the standard deviation (147 ms) is 11% of the mean (1.37 sec) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. create_10_000_run_trees: Mean +- std dev: 1.37 sec +- 0.15 sec ........... WARNING: the benchmark result may be unstable * the standard deviation (165 ms) is 12% of the mean (1.39 sec) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. create_20_000_run_trees: Mean +- std dev: 1.39 sec +- 0.17 sec ........... dumps_class_nested_py_branch_and_leaf_200x400: Mean +- std dev: 696 us +- 6 us ........... dumps_class_nested_py_leaf_50x100: Mean +- std dev: 24.9 ms +- 0.2 ms ........... dumps_class_nested_py_leaf_100x200: Mean +- std dev: 104 ms +- 2 ms ........... dumps_dataclass_nested_50x100: Mean +- std dev: 25.2 ms +- 0.1 ms ........... WARNING: the benchmark result may be unstable * the standard deviation (15.8 ms) is 22% of the mean (70.9 ms) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. dumps_pydantic_nested_50x100: Mean +- std dev: 70.9 ms +- 15.8 ms ........... dumps_pydanticv1_nested_50x100: Mean +- std dev: 195 ms +- 3 ms

Check notice on line 1 in python/langsmith/client.py

View workflow job for this annotation

GitHub Actions / benchmark

Comparison against main

+-----------------------------------------------+----------+------------------------+ | Benchmark | main | changes | +===============================================+==========+========================+ | dumps_pydanticv1_nested_50x100 | 221 ms | 195 ms: 1.13x faster | +-----------------------------------------------+----------+------------------------+ | create_5_000_run_trees | 724 ms | 681 ms: 1.06x faster | +-----------------------------------------------+----------+------------------------+ | create_10_000_run_trees | 1.40 sec | 1.37 sec: 1.03x faster | +-----------------------------------------------+----------+------------------------+ | dumps_dataclass_nested_50x100 | 25.6 ms | 25.2 ms: 1.02x faster | +-----------------------------------------------+----------+------------------------+ | dumps_class_nested_py_branch_and_leaf_200x400 | 705 us | 696 us: 1.01x faster | +-----------------------------------------------+----------+------------------------+ | dumps_class_nested_py_leaf_100x200 | 105 ms | 104 ms: 1.01x faster | +-----------------------------------------------+----------+------------------------+ | dumps_class_nested_py_leaf_50x100 | 25.1 ms | 24.9 ms: 1.01x faster | +-----------------------------------------------+----------+------------------------+ | create_20_000_run_trees | 1.39 sec | 1.39 sec: 1.00x faster | +-----------------------------------------------+----------+------------------------+ | dumps_pydantic_nested_50x100 | 66.2 ms | 70.9 ms: 1.07x slower | +-----------------------------------------------+----------+------------------------+ | Geometric mean | (ref) | 1.02x faster | +-----------------------------------------------+----------+------------------------+

Use the client to customize API keys / workspace ocnnections, SSl certs,
etc. for tracing.
Expand Down Expand Up @@ -1752,7 +1752,10 @@
data["events"] = events
if data["extra"]:
self._insert_runtime_env([data])
if use_multipart and self.tracing_queue is not None:

if self._pyo3_client is not None:
self._pyo3_client.update_run(data)
elif use_multipart and self.tracing_queue is not None:
# not collecting attachments currently, use empty dict
serialized_op = serialize_run_dict(operation="patch", payload=data)
self.tracing_queue.put(
Expand Down
14 changes: 12 additions & 2 deletions rust/crates/langsmith-pyo3/src/blocking_tracing_client.rs
Original file line number Diff line number Diff line change
Expand Up @@ -47,8 +47,7 @@ impl BlockingTracingClient {
Ok(Self { client: Arc::from(client) })
}

// N.B.: We use `Py<Self>` so that we don't hold the GIL while running this method.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GIL info here was outdated, so I removed it. The current implementation drops the GIL explicitly by calling allow_threads().

// `slf.get()` below is only valid if the `Self` type is `Sync` and `pyclass(frozen)`,
// N.B.: `slf.get()` below is only valid if the `Self` type is `Sync` and `pyclass(frozen)`,
// which is enforced at compile-time.
pub fn create_run(
slf: &Bound<'_, Self>,
Expand All @@ -59,6 +58,17 @@ impl BlockingTracingClient {
.map_err(|e| into_py_err(slf.py(), e))
}

// N.B.: `slf.get()` below is only valid if the `Self` type is `Sync` and `pyclass(frozen)`,
// which is enforced at compile-time.
pub fn update_run(
slf: &Bound<'_, Self>,
run: super::py_run::RunUpdateExtended,
) -> PyResult<()> {
let unpacked = slf.get();
Python::allow_threads(slf.py(), || unpacked.client.submit_run_update(run.into_inner()))
.map_err(|e| into_py_err(slf.py(), e))
}

pub fn drain(slf: &Bound<'_, Self>) -> PyResult<()> {
let unpacked = slf.get();
Python::allow_threads(slf.py(), || unpacked.client.drain())
Expand Down
121 changes: 90 additions & 31 deletions rust/crates/langsmith-pyo3/src/py_run.rs
Original file line number Diff line number Diff line change
Expand Up @@ -46,43 +46,43 @@ impl FromPyObject<'_> for RunCreateExtended {
}
}

fn extract_attachments(value: &Bound<'_, PyAny>) -> PyResult<Option<Vec<Attachment>>> {
if value.is_none() {
return Ok(None);
}

let mapping = value.downcast::<PyMapping>()?;
#[derive(Debug)]
pub struct RunUpdateExtended(langsmith_tracing_client::client::RunUpdateExtended);

let size = mapping.len()?;
if size == 0 {
return Ok(None);
impl RunUpdateExtended {
#[inline]
pub(crate) fn into_inner(self) -> langsmith_tracing_client::client::RunUpdateExtended {
self.0
}
}

let mut attachments = Vec::with_capacity(size);

for result in mapping.items()?.iter()? {
let key_value_pair = result?;

let key_item = key_value_pair.get_item(0)?;
let key = key_item.extract::<&str>()?;
impl FromPyObject<'_> for RunUpdateExtended {
fn extract_bound(value: &Bound<'_, PyAny>) -> PyResult<Self> {
let run_update = value.extract::<RunUpdate>()?.into_inner();

// TODO: attachments are WIP at the moment, ignore them here for now.
//
// let attachments = {
// if let Ok(attachments_value) = value.get_item(pyo3::intern!(value.py(), "attachments"))
// {
// extract_attachments(&attachments_value)?
// } else {
// None
// }
// };
let attachments = None;

// Each value in the attachments dict is a (mime_type, bytes) tuple.
let value = key_value_pair.get_item(1)?;
let value_tuple = value.downcast_exact::<PyTuple>()?;
let mime_type_value = value_tuple.get_item(0)?;
let bytes_value = value_tuple.get_item(1)?;
let io = RunIO {
inputs: serialize_optional_dict_value(value, pyo3::intern!(value.py(), "inputs"))?,
outputs: serialize_optional_dict_value(value, pyo3::intern!(value.py(), "outputs"))?,
};

attachments.push(Attachment {
// TODO: It's unclear whether the key in the attachments dict is
// the `filename`` or the `ref_name`, and where the other one is coming from.
ref_name: key.to_string(),
filename: key.to_string(),
data: bytes_value.extract()?,
content_type: mime_type_value.extract()?,
});
Ok(Self(langsmith_tracing_client::client::RunUpdateExtended {
run_update,
io,
attachments,
}))
}

Ok(Some(attachments))
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved this function down, to keep all the type definitions next to each other in the file. Otherwise it remained unchanged.


#[derive(Debug)]
Expand Down Expand Up @@ -137,6 +137,26 @@ impl FromPyObject<'_> for RunCreate {
}
}

#[derive(Debug)]
pub(crate) struct RunUpdate(langsmith_tracing_client::client::RunUpdate);

impl FromPyObject<'_> for RunUpdate {
fn extract_bound(value: &Bound<'_, PyAny>) -> PyResult<Self> {
let common = RunCommon::extract_bound(value)?.into_inner();

let end_time = extract_time_value(&value.get_item(pyo3::intern!(value.py(), "end_time"))?)?;

Ok(Self(langsmith_tracing_client::client::RunUpdate { common, end_time }))
}
}

impl RunUpdate {
#[inline]
pub(crate) fn into_inner(self) -> langsmith_tracing_client::client::RunUpdate {
self.0
}
}

#[derive(Debug)]
pub(crate) struct RunCommon(langsmith_tracing_client::client::RunCommon);

Expand Down Expand Up @@ -196,6 +216,45 @@ impl FromPyObject<'_> for RunCommon {
}
}

fn extract_attachments(value: &Bound<'_, PyAny>) -> PyResult<Option<Vec<Attachment>>> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is unchanged, just moved farther down in the same file.

if value.is_none() {
return Ok(None);
}

let mapping = value.downcast::<PyMapping>()?;

let size = mapping.len()?;
if size == 0 {
return Ok(None);
}

let mut attachments = Vec::with_capacity(size);

for result in mapping.items()?.iter()? {
let key_value_pair = result?;

let key_item = key_value_pair.get_item(0)?;
let key = key_item.extract::<&str>()?;

// Each value in the attachments dict is a (mime_type, bytes) tuple.
let value = key_value_pair.get_item(1)?;
let value_tuple = value.downcast_exact::<PyTuple>()?;
let mime_type_value = value_tuple.get_item(0)?;
let bytes_value = value_tuple.get_item(1)?;

attachments.push(Attachment {
// TODO: It's unclear whether the key in the attachments dict is
// the `filename`` or the `ref_name`, and where the other one is coming from.
ref_name: key.to_string(),
filename: key.to_string(),
Comment on lines +248 to +249
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's the ref_name

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attachments dict is name -> (mime_type, data), so is using the key for both ref_name and filename the right thing to do?

data: bytes_value.extract()?,
content_type: mime_type_value.extract()?,
});
}

Ok(Some(attachments))
}

/// Get an optional string from a Python `None`, string, or string-like object such as a UUID value.
fn extract_string_like_or_none(value: Option<&Bound<'_, PyAny>>) -> PyResult<Option<String>> {
match value {
Expand Down
Loading