SASPy Configuration with SAS Analytics Pro Cloud Native


SAS Analytics Pro Cloud Native lets you run SAS within a containerized environment, which opens up exciting possibilities for CI/CD and for integrating SAS into other applications.

SASPy has become a popular choice for data scientists and integration developers who want to bring the power of SAS procedures and the data step to Python software development chains.  This post outlines the steps required to configure SAS Analytics Pro Cloud Native to accept the SSH connections that SASPy requires, and augments the current documentation for using SASPy with SAS Analytics Pro.

Note that the same steps can also be used to call SAS natively in STDIO mode from your host machine to the container.

Some Notes on Terminology

For people without a lot of experience in using Docker, SSH, Python or networking, the terminology in web articles can be confusing and overwhelming.  The table below outlines the meanings of the terms used in this article.  For further information on how SAS Analytics Pro works in Docker, please see our previous article which outlines Docker concepts and another article on the differences between SAS Analytics Pro and SAS9.

| Term | Meaning |
|---|---|
| Apro | Refers to SAS Analytics Pro Cloud Native which is a product offering from SAS for running a SAS Programming environment within Docker. |
| Docker | Docker is a technology company that provides a runtime and development tools for interacting with Containers and Images. |
| Image | An image is a set of compressed software libraries and binaries that can be executed as a container inside the Docker runtime environment. Images operate against a common OS kernel. SAS provide an Apro image which can be run as a container. |
| Containers | Containers are an instance of a Docker image. Containers contain additional configuration information such as network settings, volume and port mappings. |
| Host | Your local machine where you are starting the Apro container from.  This may be your laptop, PC, or a server. |
| SSH | This is a communication method for connecting from one machine to another in a network.  In this instance we are performing SSH connections between your host and the Apro container. |
| Key pair | These are a set of cryptographically generated keys used to identify and authenticate you when using passwordless connections over SSH. |

About SASPy

SASPy is a Python package developed by SAS and open-source contributors.  It provides an interface to the SAS language, including the submission of SAS code, procedures and data interaction.  It offers a number of connection methods depending on the type of SAS platform you want to connect to.  These include:

  • IOM based connections for SAS9 / Metadata server platforms.
  • HTTP/S for Viya
  • STDIO over SSH for Linux based servers
  • STDIO for local connections on Linux where SAS is installed on the machine you are working from.

We will be using STDIO over SSH in this scenario.  The STDIO over SSH method enforces passwordless SSH connections so we will need to set this up.

Configuring Passwordless SSH

The first thing you need to set up passwordless SSH is a public and private key pair.  To generate one, you need an SSH client on your host.  On Linux based operating systems OpenSSH is usually already installed.  On Windows you may need to install it, or if you use Git, the Git Bash client already includes it.

To check if you have an existing key pair, first check your %USERPROFILE%\.ssh directory on Windows or ~/.ssh directory on OSX/Linux.

cd ~
ls -al .ssh/

Or in PowerShell, where the environment variable is written as $env:USERPROFILE rather than %USERPROFILE%:

cd $env:USERPROFILE
ls .ssh/

If the listing shows a pair of files whose names start with id_, one with a .pub extension and one without, you already have a public / private key pair.  For example, if you have an RSA key pair you would see two files:

id_rsa
id_rsa.pub

If you have existing keys, ideally they are configured without passphrases.  Passphrases are great for interactive usage as they add an additional layer of security but they hinder things when using keys in automation scripts.  For SASPY, keys without passphrases work best.
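The same check can be scripted. The following is a minimal sketch, assuming keys follow the id_xxxx naming convention described above; the function name find_key_pairs is made up for this illustration.

```python
from pathlib import Path


def find_key_pairs(ssh_dir):
    """Return the names of private keys in ssh_dir that have a matching .pub file."""
    ssh_dir = Path(ssh_dir)
    if not ssh_dir.is_dir():
        return []
    pairs = []
    for public_key in sorted(ssh_dir.glob("id_*.pub")):
        private_key = public_key.with_suffix("")  # id_rsa.pub -> id_rsa
        if private_key.is_file():
            pairs.append(private_key.name)
    return pairs
```

Calling find_key_pairs(Path.home() / ".ssh") should work on Windows, macOS and Linux alike, since Path.home() resolves to the user profile directory on each platform.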

Creating a Key Pair

If you don’t want to use an existing key pair, or do not have one, you can generate a new pair using the following command.

ssh-keygen -t rsa -b 4096 -C "sasdemo"

Let’s break this down:

  • The command ssh-keygen creates the public and private key pair.
  • The -t rsa argument tells ssh-keygen what type of key to generate. In this case it is an RSA key.
  • The -b 4096 argument tells ssh-keygen the key size, in bits, to use.
  • The -C "sasdemo" argument is a comment to help identify what the key is for.  It is appended to the public key.

After pressing Enter you will be prompted for a few values.  Press Enter at each prompt to accept the default.  The exception is if you already have a key named id_rsa and you don’t want to overwrite it.  In that case, specify a new name in the same file path that ssh-keygen suggests (it defaults to $home/.ssh/<name>).  The illustration below shows this.  I have named my key pair id_rsa_apro.

[Figure: Generating a Key Pair]
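If you would rather script the key generation than answer the prompts, a sketch along these lines may help. It assumes ssh-keygen is on your PATH. The -f (output file) and -N (passphrase) options are standard ssh-keygen options that are not part of the command shown above, and the function names are invented for this example.

```python
import shlex
import subprocess
from pathlib import Path


def keygen_command(key_path, comment="sasdemo", bits=4096):
    """Build an ssh-keygen argument list for an RSA key pair with no passphrase."""
    return [
        "ssh-keygen",
        "-t", "rsa",          # key type
        "-b", str(bits),      # key size in bits
        "-C", comment,        # comment stored in the public key
        "-f", str(key_path),  # output file, so no file name prompt (assumption)
        "-N", "",             # empty passphrase, so no passphrase prompt (assumption)
    ]


def generate_key_pair(key_path):
    """Run ssh-keygen, refusing to overwrite an existing private key."""
    key_path = Path(key_path)
    if key_path.exists():
        raise FileExistsError(f"{key_path} already exists; choose another name")
    key_path.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(keygen_command(key_path), check=True)


# Show the command without running it.
print(shlex.join(keygen_command("id_rsa_apro")))
```

Nothing is generated until generate_key_pair is called with the full path you want, for example the id_rsa_apro name used in this article.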

Take note of the name of the key you generated as we will need this later.  Next we need to configure the Apro container to allow SSH connections.

Configuring SSH in Apro

Containers are generally built to be as lightweight as possible and, as such, usually do not have all the libraries and packages of a full operating system.  In fact, this is one of the 12 principles of Docker image development.

The Apro container will allow SSH without much configuration.  The SAS instructions for this are fairly clear and are transcribed below.

  • In your /sasinside directory, create a folder called sasosconfig and in that new folder place an empty file called sshd.conf
  • In your container startup definition you need to add some system capabilities with --cap-add arguments.  These capabilities are a Linux concept.  To read more about their specifics see this guide. The capabilities we are adding are:
    • --cap-add AUDIT_WRITE
    • --cap-add SYS_ADMIN
  • We also need to expose a port for SSH communication.  We will use the same port used by SAS in their example.  Add the following to your invocation command.  This tells Docker to publish port 22 in the container on port 8222 of your host.
    • --publish 8222:22

Once you have done the above, restart your container for the changes to take effect.  If you have followed the steps correctly, you should see your container running in your Docker client.
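The container-side preparation can be sketched in Python as follows. This is an illustration only: the function names are made up, and the argument list covers just the SSH-related options from the steps above, which you would add to your existing docker run command.

```python
from pathlib import Path


def enable_sshd(sasinside_dir):
    """Create sasosconfig/sshd.conf under the sasinside directory if it is missing."""
    config_dir = Path(sasinside_dir) / "sasosconfig"
    config_dir.mkdir(parents=True, exist_ok=True)
    sshd_conf = config_dir / "sshd.conf"
    sshd_conf.touch()  # creates an empty file; an existing file is left untouched
    return sshd_conf


def ssh_run_arguments(host_port=8222):
    """Extra docker run arguments for SSH access to the container."""
    return [
        "--cap-add", "AUDIT_WRITE",
        "--cap-add", "SYS_ADMIN",
        "--publish", f"{host_port}:22",  # host port -> container port 22
    ]


print(" ".join(ssh_run_arguments()))
```

If port 8222 is already in use on your host, pass a different host_port and use the same number in the later ssh and SASPy steps.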

Now we have to create a new directory in the Apro container and set some permissions on it so that we can copy in the key you generated earlier.  From a command line:

  • Create the .ssh folder under the path you specified for the /data directory.
  • Run docker exec -u sasdemo sas-analytics-pro chmod -R 755 /data/.ssh to set the permission level for the folder.  SSH expects your ssh folder to have restrictive permissions. 755 is the most permissive allowed.

Next we want to copy your public key you created earlier into the newly created .ssh folder.

To do this, we can use the docker cp command. SSH looks for a file called authorized_keys which contains a list of public keys that the server will accept connections from:

docker cp %USERPROFILE%\.ssh\id_rsa_apro.pub sas-analytics-pro:/data/.ssh/authorized_keys
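For repeatable setups, the two Docker commands above can be assembled in a script. The sketch below only builds and prints the commands; the function name is hypothetical, and the container name, user and /data/.ssh path are the ones used in this article. Note that docker cp replaces any existing authorized_keys file at that path.

```python
import shlex


def install_key_commands(public_key, container="sas-analytics-pro", user="sasdemo"):
    """Return the docker commands that set permissions and install the public key."""
    ssh_dir = "/data/.ssh"
    return [
        # Restrict permissions on the .ssh folder inside the container.
        ["docker", "exec", "-u", user, container, "chmod", "-R", "755", ssh_dir],
        # Copy the public key in as the authorized_keys file.
        ["docker", "cp", str(public_key), f"{container}:{ssh_dir}/authorized_keys"],
    ]


for command in install_key_commands("id_rsa_apro.pub"):
    print(shlex.join(command))
```

Each command list could be passed to subprocess.run(command, check=True) once the container is running.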

If all has gone well, we can now test our connection!

Testing the SSH connection

If you are on Windows, the Docker container IP address may not be usable from outside of the container. On Windows we simply use localhost or 127.0.0.1 as our server address.

SSH also enforces strict host key checking by default. Your connection attempt will fail with an error when the IP address of your Docker container changes, which happens whenever you restart it. To get around this, you can use one of the following techniques.

Under your .ssh folder you created earlier, add an additional file called config and add the following:

Host *
StrictHostKeyChecking no

This is quite permissive. It tells SSH to skip host key checking for every host you connect to. To be more stringent and limit this to your SASPy connection, you can instead place the connection arguments in your SASPy sascfg_personal.py file, which we cover later in this article.
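A middle ground, not covered in the steps above, is to scope the setting to the local addresses used for the container rather than to every host. The sketch below generates such a block for the SSH config file. It relies on the standard behaviour that a Host line accepts several patterns; the function name is invented for this illustration.

```python
def scoped_host_block(patterns=("localhost", "127.0.0.1")):
    """Render an SSH config block that relaxes host key checking for given hosts only."""
    lines = [
        "Host " + " ".join(patterns),
        "    StrictHostKeyChecking no",
    ]
    return "\n".join(lines) + "\n"


print(scoped_host_block())
```

The printed text can be pasted into the config file in place of the Host * entry, so that connections to other servers keep their normal host key checks.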

Test via Command Line

To test that SSH is configured, let’s see if we can get an interactive terminal session with SAS working.

ssh -t -vvv -p 8222 -i %USERPROFILE%\.ssh\id_rsa_apro sasdemo@127.0.0.1 /opt/sas/viya/home/SASFoundation/sas -fullstimer -nodms -stdio -terminal -nosyntaxcheck -pagesize MAX

While this is a long command let’s break it down to see what is happening:

  • We are using the ssh program and forcing a pseudo terminal with the -t option.  This is useful when using interactive command line programs.
  • The -vvv flag gives us the most verbose logging.  It’s always a good idea to use it when testing so you can see additional information about what’s going on. Fewer v’s give less logging detail.
  • The -p 8222 argument tells ssh to connect on port 8222.  This is the port you specified in your Apro setup for SSH.  Replace the number with the one you used.
  • The -i argument tells ssh to use the identity file that follows.  This file is the private key part of the key pair.  It is only needed when your key pair does not use a standard name, so you can leave it out if you accepted the defaults when generating the key pair.
  • Next we have the server address we are connecting to.  This is in the form <username>@<server address>.  If you have started Apro with a different username, replace sasdemo with the login name you use for SAS Studio.  For the second part we specify either localhost or 127.0.0.1.  The SAS documentation for this part is incorrect.  As noted above, on Windows the Docker container IP address will not be usable from outside of the container.
  • The final part of the command invokes SAS in STDIO mode.  This is the command that SASPy generates when starting a connection.  If we can connect to SAS this way, we are almost assured that SASPy will also connect.
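The pieces described above can be assembled programmatically, which makes it easier to vary the port, user or key file while testing. This is a sketch that only builds and prints the command; the function and constant names are made up for the example.

```python
import shlex

# The SAS invocation from the test command above.
SAS_STDIO_COMMAND = [
    "/opt/sas/viya/home/SASFoundation/sas",
    "-fullstimer", "-nodms", "-stdio", "-terminal",
    "-nosyntaxcheck", "-pagesize", "MAX",
]


def ssh_test_command(identity_file, user="sasdemo", host="127.0.0.1",
                     port=8222, verbosity=3):
    """Build the ssh command that starts SAS in STDIO mode inside the container."""
    command = ["ssh", "-t"]                      # force a pseudo terminal
    if verbosity:
        command.append("-" + "v" * verbosity)    # -v, -vv or -vvv
    command += ["-p", str(port), "-i", str(identity_file), f"{user}@{host}"]
    return command + SAS_STDIO_COMMAND


print(shlex.join(ssh_test_command("id_rsa_apro")))
```

Setting verbosity=0 drops the logging flag entirely once the connection is known to work.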

If all is successful, you will get an interactive window to SAS.

[Figure: SAS STDIO]

To exit, hit enter and type endsas;;;;  Alternatively, stop the tutorial here and join the ranks of SAS demi gods by using SAS in the original form!

SASPy Configuration

Once we have confirmed that SSH is working correctly to our container, we need to update our SASPy configuration.  This article won’t go into the details of installing SASPy, as there are plenty of articles and detailed help in the SASPy user documentation at saspy.readthedocs.io.

Open your sascfg_personal.py file and add the following:

  • Add a new entry in the SAS_config_names variable list to label the new connection.  Here I will use 'apro'
  • Next create a new dictionary variable with the same name you added to SAS_config_names.

apro = {
    'saspath': '/opt/sas/viya/home/SASFoundation/sas_u8',
    'ssh': 'ssh',
    'host': 'localhost',
    'port': 8222,
    'luser': 'sasdemo',
    'localhost': 'host.docker.internal',
    'encoding': 'utf_8',
    'dasho': 'StrictHostKeyChecking=no'
}

Replace the luser value with your SAS Studio login userid, the port you opened in your docker config and any additional SAS options in the options key (not shown).
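Putting both steps together, a complete sascfg_personal.py might look like the sketch below. The identity entry is an addition that is not part of the configuration shown above: it is SASPy's option for passing a key file to ssh -i, and is assumed here because the key pair in this article uses the non-default name id_rsa_apro. Adjust or remove it to match your own key.

```python
import os

# Names of the configurations defined in this file.
SAS_config_names = ['apro']

apro = {
    'saspath': '/opt/sas/viya/home/SASFoundation/sas_u8',
    'ssh': 'ssh',
    'host': 'localhost',
    'port': 8222,                         # host port published for container port 22
    'luser': 'sasdemo',                   # your SAS Studio login
    # Assumption: private key with a non-default name, passed to ssh -i.
    'identity': os.path.expanduser('~/.ssh/id_rsa_apro'),
    'localhost': 'host.docker.internal',  # used for upload and download
    'encoding': 'utf_8',
    'dasho': 'StrictHostKeyChecking=no',  # passed to ssh as a -o option
}
```

Because the file is plain Python, values such as the key path can be computed rather than hard-coded, as done here with os.path.expanduser.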

This configuration is slightly different to a standard SSH STDIO connection with the addition of the localhost and dasho parameters. These are optional and depend largely on whether or not you are running SAS Analytics Pro via Docker Desktop on Windows or via the Docker runtime.

The upload and download methods in SASPy use the socket filename engine in SAS to transfer files between client and server, so the SAS session inside the container must be able to reach your host. Docker adds an entry to the Windows hosts file which directs your IPv4 address to host.docker.internal. This is then used by Docker to communicate with the external host when required. For a longer discussion on this, please see the following GitHub issue between myself and SAS on the topic.
As mentioned previously, you may also receive errors from SSH when your container IP address changes as a result of restarting your container. The dasho parameter is passed to the ssh command to disable this host key checking for SASPy.
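One way to confirm that the localhost setting works is a small file round trip. The sketch below assumes a connected SASPy session object and assumes that its upload and download methods return a dictionary with a 'Success' entry; the function name and file paths are hypothetical.

```python
def round_trip(session, local_file, remote_file, downloaded_copy):
    """Upload a file to the SAS server, download it again, and report success."""
    uploaded = session.upload(local_file, remote_file, overwrite=True)
    if not uploaded.get("Success"):
        return False
    downloaded = session.download(downloaded_copy, remote_file, overwrite=True)
    return bool(downloaded.get("Success"))
```

With a live session this could be called as round_trip(sas, 'test.txt', '/data/test.txt', 'test_copy.txt'). A False result suggests the container cannot reach the host address given in the localhost key.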

Once complete you can test your SASPy connection.

try:
    import saspy
except ImportError:
    raise ImportError('saspy was not found in the specified path')

# Start a session using the 'apro' entry from sascfg_personal.py
sas = saspy.SASsession(cfgname='apro')
# Submit a one-line program and print the SAS log it returns
res = sas.submit(code='%put NOTE: Success;', printto=True)
print(res.get('LOG'))

If you have your sascfg_personal.py file somewhere other than the default paths, add cfgfile='/path/to/your/file.py' to the SASsession call, along with cfgname='name' if you are not using the default configuration.

Check that you see NOTE: Success in the returned log. If so, you now have SASPy connected to SAS Analytics Pro!
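Because the SAS log also echoes the submitted %put statement, a plain substring search would match even if the note itself was never written. A slightly stricter check, sketched below with a made-up helper name, looks for a log line that starts with the note text.

```python
def log_has_note(log, text="Success"):
    """Return True if any log line starts with 'NOTE: <text>'."""
    expected = f"NOTE: {text}"
    return any(line.strip().startswith(expected) for line in log.splitlines())


# Illustrative log text written for this example, not captured from a real session.
sample_log = "1    %put NOTE: Success;\nNOTE: Success\n"
print(log_has_note(sample_log))
```

With a live session you would pass res.get('LOG') from the test above instead of sample_log.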

Conclusion

Hopefully this tutorial has been a help to you and assists you in setting up SASPy with Analytics Pro.

We’ve also created the Selerity Desktop (Personal) tool to help make deployments easier if you are uncomfortable with the above concepts.  The Selerity Desktop configures SASPy connectivity for you. The personal edition allows you to deploy container environments with a series of additional options such as Python, Clinical Standards Toolkit and SAS OQ testing without needing to know any of the technical details. If you are interested, please see our product page for further information or reach out to us to discuss more complex deployments or licensing requirements.

Also stay tuned for future posts where we delve deeper into use cases that SAS Analytics Pro Cloud Native can support and other, more advanced, deployment options.

Cameron (Selerity)
