Using a f-strings to pass a python tuple as an varlist in the dsopts where statement #542
Replies: 5 comments 3 replies
-
Hey, I've been trying to reproduce this to see what's happening. I've found a number of things while trying to do so. I've not been able to get the failure you're getting, though I see in the code where that error would be coming from, though actual failure doesn't make sense, unless SAS crashed or something. I don't have anything like that happen. First, when I have a where clause as long as that I get the following error on the initial data step, though it doesn't fail, and the other steps continue to run:
I have to think that's the first part of the problem. I don't know what versions of SASPy and SAS you're running with, so maybe that's causing it to fail differently on your side. Can you submit your SASsession object and post that so I can see your versions?
The other thing I see is that when I have a valid length where clause for this, it doesn't return any results. I don't believe that's a valid statement but it doesn't cause an error which is strange. The doc for the where data set option only says it supports AND, AND NOT, OR, or OR NOT: https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/ledsoptsref/p0ny9o8t8hc5zen1qn3ft9dhtsxx.htm Out of curiosity, this can't be efficient, even if it works. Why not run the sql you did before and then do sd2df() on that result? |
Beta Was this translation helpful? Give feedback.
-
Cool, thanks, you're on current version of both! Well, this is strange. One way would be not to upload the whole df2, but simply make your list (variable_1) a df and upload it, then just join on that (it's a one row table with the values). That may be most efficient, but I'd have to play around with a real example to know for sure. Certainly multiple ways to skin that proverbial cat. Here's the code I came up with to try to simulate what I thought you were doing. Can you try it and see what you get, and see if it is equivalent to your scenario, of if I'm doing something wrong that's making it not the same?
|
Beta Was this translation helpful? Give feedback.
-
Well, I figured out why nothing I was doing was working as expected, nor making sense. For some reason sashelp .cars has a leading blank in the model variable (all of the rows). That shouldn't be, and that's why I wasn't getting any results, because none of the values I was comparing with had leading blanks and SAS is comparing byte for byte.
|
Beta Was this translation helpful? Give feedback.
-
So, I've fixed my original test case to use a copy of cars with a cleaned up model column. Now it works as expected, and for the case w/ too long a line, I still don't get a failure like your original post showed. But, what do you get when you run this code?
Here are my results:
|
Beta Was this translation helpful? Give feedback.
-
ok, so that behaves the same for you. I'm not sure what your original errors is from, for the case you were trying. It is strange that even with the error that the line being submitted (containing the where clause) is too long, the step continues to run. I'm not sure how it's being processed; if it is using the whole where clause or not. I wonder if something about that is causing more of a problem for your case. I don't know your data is or what your where clause for that is; I was just trying to come up with something similar, which was enough to see what was happening. Were you able to try your case with sending the where clause values over as a one column table then doing the join in SAS, like I showed? Wonder if that's faster than the in clause with all those literal values or not. It would also work with all the values, which cause the line to be too long; as there's nothing in the SAS code being submitted (listing those out) with the join case; just a simple proc sql. |
Beta Was this translation helpful? Give feedback.
-
Hi!
I work with multiple large dataset on server and a common approach to filtering is to do the following PROC SQL procedure in SAS
Now that i´m working in SASPy I have been trying to find a similar procedure by using SAS Data Step. By using python f-string I can fetch the values from a python variable and display them as an part of the string in the dsopts where statement by doing the following.
variable_1 = tuple(set(dataframe_2["column_1"].values.tolist()))
The procedure work well when the number of items in the variable_1 is low but i got one example where the length of the variable_1 is about 720 000. When i run the above mention example with a big variable_1 I get the follow error message
I have tried to recreate the problem in SAS EG but since there is a limit to the length of a macro variable I can´t. I also understand that the issue might be that SAS can´t create long macro variable but is that the problem or has it something to do with the Data Step procedure.
I´m using a iomlinux connection by the way.
One solution to this would be to get the data in smaller batches and then concatenate the DataFrame to get one object but is there any other solution I have been missing?
any help would be greatly appreciated!
Beta Was this translation helpful? Give feedback.
All reactions