
Why are there so many Python installers?

Those who have been following Python development on Windows recently will be aware that I’ve been actively redeveloping the installer. And if you’ve been watching closely you’ll know that there are now many more ways to install the official python.org release than in the past, not even including distributions such as WinPython or Anaconda.

In this post, I’m going to discuss each of the ways you can install the official releases of Python (since version 3.5), provide some context on when and why you would choose one over another, and discuss the positives and negatives of each approach.

History

Historically, there was one single MSI installer available that was intended to cover the needs of all Python users.

This installer would allow you to select a target directory and some features from its user interface or the command line (if you know the magic words), and would generally install the full distribution with all entry points (shortcuts, etc.).

Unfortunately, due to the nature of how MSIs work, there are some limitations that affect the user experience. The most significant of these is that an MSI cannot decide at install time whether to elevate – the choice has to be hardcoded. As a result, the old installer always requires administrative privileges just in case you choose to install for all users, which prevents installing Python on machines where you do not have full control over the system.

Secondly, while Python is often seen as one monolithic package, it is actually made up of a number of separable components. For example, the test suite is not required for correct operation, and neither are the documentation or the development headers and libraries. While MSIs do support optional features, they tend to encounter issues when upgrading between versions (such as forgetting which options you had selected), and the installer always has to carry around the optional components even if you never install them.

Finally, some operations that are not simple file installations can be complicated. For example, when pip is installed or the standard library is precompiled, the MSI executes a background task rather than installing files normally. Without careful configuration of the MSI, these files will not be properly uninstalled or repaired, and a failure in one of these tasks can cause the entire installation to fail. At worst, the uninstall step itself could fail, making it impossible to remove Python.

Modern Era

The issues described above have been addressed by the installers available since Python 3.5. However, there are also other uses for Python that do not lend themselves to a regular installer. For example, applications that want to include Python as a runtime dependency may not want to install a global copy of Python, build machines may require semi-public but non-conflicting installs of different versions, and platform-as-a-service web hosts may not allow normal installers.

Since Python 3.5.2, the official Python releases have been made available as executable installers, embeddable ZIP packages, nuget packages and Azure site extensions. There are also a range of third-party distributions that include the official Python binaries, along with other useful tools or libraries.

How do I know if a third-party distribution has the official binaries? Find the install directory, right-click each of the exe, dll or pyd files and select Properties, then Digital Signatures. If the signature is from the Python Software Foundation, it’s the official binary and has not been modified. If there is a different signature or no signature, it may not be the same as what is released on python.org.
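
If you have the Windows SDK tools available, you can also verify a file from the command line with signtool; the path below is illustrative:

C:\> signtool verify /pa C:\Python35\python.exe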

Executable installers

The executable installers are the main way that users download Python, and are the featured downloads at python.org. I think of these as the Python Developer Kit.

These installers provide the most flexible user interface, include all dependencies such as system updates and the Python launcher, generate shortcuts for the interpreter, the manuals and the IDLE editor, and correctly support upgrades without forgetting about feature selection.

Two versions of the executable installer are available for any given release – one labelled “executable installer” and the other “web-based installer”.

The web-based installer is typically a small initial download (around 1MB), which gets you the full installer UI. After you have selected or deselected optional components, the minimum set of packages necessary to install Python is downloaded and installed. This makes it easy to minimize the overall download size, since unused or unnecessary components are never downloaded, though it does require an internet connection at install time. (There’s also a command-line option to download all the packages you may ever need up front, so they can be reused later instead of being downloaded over and over again.)
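
For example, using the /layout option covered near the end of this post (the file name here is illustrative):

rem Sketch: download every component once for later offline installs
python-3.5.2-webinstall.exe /layout C:\PythonLayout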

The other installer includes the default set of features in the EXE itself. As a result, the initial download is around 30MB, but in most cases you can install without requiring any further internet access. For a single installation, your download will likely be 3-5MB larger compared to using the web-based installer, but if you use it to install on multiple machines then you’ll likely come out ahead.

Both executable installers result in identical installations and can be automated with identical command-line options. As I mentioned above, I think of this as the Python Developer Kit, which is why there are optional features to download debugging symbols or a complete debug build, which are not available in any other options. The Python Developer Kit provides everything necessary for someone to develop a complete Python application.
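
As a sketch, a fully unattended installation looks something like this – the property names are the same ones discussed later in this post, but treat the exact file name and values as illustrative:

rem Sketch: silent, per-machine install including pip (values are illustrative)
python-3.5.2-amd64.exe /quiet InstallAllUsers=1 TargetDir=C:\Python35 Include_pip=1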

What about having a single MSI installer? There’s a section coming up about this. Just keep reading.

Embeddable package

If the executable installer is the Python Developer Kit, then the embeddable package is the Python runtime redistributable. Rather than trying to be an easy-to-use installer, this package is a simple ZIP file containing the bare minimum of Python required to run applications. This includes the python[w].exe executables, the python35.dll (or later) and python3.dll libraries, the standard library extension modules (*.pyd), and a precompiled copy of the standard library stored in another ZIP file.

The resulting package is about 7MB to download and around 12MB when extracted. Documentation, tools, and shortcuts are not included, and the embeddable package cannot be used to build or install additional packages. However, once your application is ready, rather than instructing users to install Python themselves, you can include the contents of this package in your own installer. (For example, Microsoft’s command-line tools for Azure will likely do this, and installers created using pynsist can include this package automatically.)

Using the embeddable package allows you to distribute applications on Windows that use Python as a runtime without exposing it to your users. By default, a configuration file is also included that forces isolated mode and prevents environment variables and registry settings from affecting the runtime (python36._pth on Python 3.6; pyvenv.cfg for Python 3.5). On Python 3.6 this file can also specify additional search paths. If your application is hosting Python, you can also choose not to distribute python.exe or any extension modules that are not used in your application.
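
As an illustration, the python36._pth in the 3.6 package looks roughly like this – each line is a search path relative to the file, and uncommenting the import site line restores normal site initialisation:

python36.zip
.
# Uncomment to run site.main() automatically
#import site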

There is no support for pip, setuptools or distutils in the embeddable package, since the idea is that you will develop against the Python Developer Kit and then lock your dependencies when you release your application. Depending on the installer technology you are using for your application, you will probably vendor any third-party packages by copying them directly into the directory with your Python code.

See this blog post for more information about how to take advantage of the embeddable distribution.

Nuget package

Nuget is a packaging technology typically used on Windows to manage development dependencies. There are many packages available as source code or pre-built binaries, mostly for .NET assemblies, as well as build tools and extensions.

There are four Python packages available on nuget, released under my name (steve.dower) but built as part of the official python.org releases. The packages are python and pythonx86 (64-bit and 32-bit Python 3) and python2 and python2x86 (64-bit and 32-bit Python 2.7).

These may be referenced by projects in Visual Studio or installed directly using nuget.exe to easily get a copy of Python into a build directory. The interpreter typically ends up at a path like packages\python.3.5.2\tools\python.exe, though this can often be customised.

rem Install Python 2.7
nuget.exe install -OutputDirectory packages python2

rem Add -Prerelease to get Python 3.6
nuget.exe install -OutputDirectory packages -Prerelease python

rem More options are available
nuget.exe install -Help

The contents of the nuget packages are somewhere between the full installation and the embeddable package. The headers, libs and pip are included so that you can install dependencies or build your own modules. The standard library is not zipped, but the packages do not include the CPython test suite or libraries intended for user interaction. Operating system updates are not included, so you will need to ensure your build machine is up to date before using these packages.
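
For example, a build step might use the bundled pip to restore dependencies – the path matches the example above, and the requirements file is hypothetical:

rem Sketch: restore build dependencies with the nuget-installed Python
packages\python.3.5.2\tools\python.exe -m pip install -r requirements.txt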

There is no configuration in these packages to restrict search paths or environment variables, as these are very important to control in build definitions. As a result, there is a high likelihood that a regular installation of Python may conflict with these packages. In general, it’s best to avoid installing Python on build machines where you are using these packages. If you need a full installation, avoid using the nuget packages or test for conflicts thoroughly. (Note that conflicts typically only occur within the same x.y version, so you can safely install 2.7 and use the 3.5 nuget packages.)

Azure Site Extensions

Note: This particular package is released by Microsoft and is managed by my team there. The Python Software Foundation is not responsible for this package.

Azure App Service is a platform-as-a-service offering for web services (including web apps, mobile backends, and triggered jobs). It uses site extensions to customise and enhance your web services, including a range of Python versions to simplify configuration of Python-based servers.

Because web services are sensitive to even the smallest change in a dependency, each version is available as its own package. This allows you to be confident that when your site uses one of these it is not going to change without you explicitly updating your site. The packages available at the time of writing are python2711x64, python2712x64, python2712x86, python351x64, python352x64 and python352x86, as the nuget.exe listing below shows.

The contents of these packages are almost entirely unmodified from the official python.org releases. Some extra files for correct installation, configuration and behaviour of the web server are included, as well as copies of pip, setuptools, and certifi. Occasionally a package will include targeted patches to fix or work around issues with the platform, but we always aim to upstream fixes as soon as possible. Under the hood, these are simply nuget packages that can also be installed using nuget.exe on any copy of Windows.

C:\> nuget.exe list python -Source https://www.siteextensions.net/api/v2/
python2711x64 2.7.11.3
python2712x64 2.7.12.2
python2712x86 2.7.12.1
python351x64 3.5.1.6
python352x64 3.5.2.2
python352x86 3.5.2.1

Visit aka.ms/PythonOnAppService for the most up-to-date information about how to use these packages on Azure App Service.

Hypothetical Futures

While that covers the current set of available installers, there are some further use-cases that are not as well served. In this section I will briefly discuss the cases that I am currently aware of and their status. There are no promises that official installation packages for these will ever be produced (bearing in mind that Python is developed almost entirely by volunteers with limited free time), but there is also nothing preventing third-parties from producing and distributing these formats.

Are you already distributing Python in any of these formats? Let me know and I’m happy to link to you, provided I’m not concerned about the contents of your distribution.

Nuget package for source/runtime dependency

Earlier I discussed the nuget packages as build tools, but the more common use of nuget packages is for build dependencies. Normally a project (typically a Visual Studio project, but nuget can also be used independently) will specify a dependency on a source or binary package and obtain build steps or configuration from a known location within the package.

Providing a nuget package containing either the Python source code or the embeddable package may simplify projects that host the runtime. These would predominantly be C/C++ projects rather than pure Python projects, but some installer toolkits may prefer a ready-to-embed nuget package rather than a plain ZIP file.

There has not been much demand for this particular format. In general, a C/C++ project can make equally good use of the current nuget packages, and would require those for the headers and libraries anyway, while the embeddable package is not always suitable for installation completely unmodified. These reduce the value of a dependency nuget package to nearly zero, which is why we currently don’t have one.

Universal Windows Platform

The Universal Windows Platform is part of Windows 10 and specifies a common API set that is available across all Windows devices. This includes PCs, tablets, phones, IoT Core, Xbox, HoloLens, and likely any new Windows hardware into the future.

Providing a UWP package of Python would allow developers to distribute Python code across all of these platforms. Indeed, the team behind IoT Core have already provided their version of this package. However, as the API set is not always compatible with the Win32 API, this task requires supporting a new platform within Python (that is, sys.platform would return a value other than 'win32'). Currently nobody has completely adapted Python for UWP, added the extensions required to access new platform APIs, or fully implemented the deployment tools needed for this to be generally useful (though the IoT Core support is a huge step towards this).

Administrative Deployment

System administrators will often deploy software to some or all machines on their network using management tools such as Group Policy or System Center. While it is possible to remotely install from the executable installers, these tools often require or have enhanced functionality when the installer is a pure MSI.

Unfortunately, the issues and limitations of MSI described at the start of this post still apply. A single MSI cannot install all of the required dependencies, cannot decide at install time whether to require administrator privileges, and cannot robustly ensure that upgrade and remove operations behave correctly. However, it would be possible to produce a suitable MSI and installation instructions for the limited use case of administrative deployment. Such a package would likely have these characteristics:

  • Fails if certain operating system updates are missing
  • Always requires administrator privileges
  • Only allows installation for all users
  • Only allows configuration at the command line via msiexec (a hypothetical invocation is sketched after this list)
  • Requires a separate task to precompile the standard library and install pip
  • Requires additional cleanup task when uninstalling
  • Prevents the executable installer from installing for all users
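
To make this concrete, deployment would look something like the following purely hypothetical command – no such MSI is published today, the file name is invented, and /qn, ALLUSERS and TARGETDIR are standard Windows Installer options:

rem Hypothetical: silent, machine-wide deployment of an MSI that does not exist yet
msiexec /i python-3.5.2.msi /qn ALLUSERS=1 TARGETDIR=C:\Python35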

System administrators would be responsible for following the documentation associated with such an MSI, and I have no doubt that most are entirely capable of doing this. However, as this would not be a good experience for most users, it cannot be the default or recommended installer. I’m aware that there are some people who are aggrieved by this, but interactive installs are vastly more common and so take priority when determining what to offer from python.org.

Summary

Installing Python on Windows has always been a fairly reliable process. The ability to select precisely which version you would like without fear of damaging system components allows a lot of confidence that is not always available on other platforms. Improvements in recent releases make it easier to install, upgrade and manage Python, even for non-administrator users.

We have a number of different formats in which Python may be obtained depending on your intended use. The executable installers provide the full Python Developer Kit; the embeddable package contains the runtime dependencies; nuget packages allow easy use of Python as a build tool; and site extensions for Azure App Service make it easier to manage Python as a web server dependency.

There is also potential to add new formats in the future, either through third-party distributions or as new maintainers volunteer their time towards core development. For an open-source project that is run almost entirely on volunteer time, Python is an amazing example of a robust, trustworthy product with as much flexibility as any professionally developed product.

Discussion of this post is welcome here in the comments. If you are having issues installing Python, please file an issue on bugs.python.org.

Building extensions for Python 3.5 Part Two

This was never meant to become a series, but unfortunately after posting what is now part one, we found a serious regression that led us to release Python 3.5.0rc4.

If you haven’t already, I recommend going back and reading part one up until the point where it says to come back here. That will fill you in on the background.

This is the point where I ought to make the “seriously, I’ll wait” joke, knowing full well that nobody is going to read the old post. Instead, I’m going to write one sentence, and if you understand the words in it, you have enough background to continue reading.

MSVC 14.0 (and later) and the UCRT give us independence from compiler versions, provided you build with /MT and force ucrtbase.dll to be dynamically linked.

If you scratched your head about any part of that, go and read part one. You’ll thank me in about two seconds.

The /MT Problem

Previously, I discussed some of the problems that arise when compiling with the /MT option. Mostly, because the option needs to be specified throughout all your code, it was going to cause issues with static libraries that had already been built.

Well, we found one more problem.


I want to stop for a second to thank Christoph Gohlke for all his help through the 3.5.0 RCs.

For those who don’t know, Christoph maintains an epic collection of wheels for Windows. Every time you have trouble installing a package with pip, it’s worth visiting his site to see if he has a wheel available for it. After downloading the wheel, you can pass its path to pip install and you’ll get your package.
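
For example (the file name is illustrative – these wheels encode the package version, CPython version and platform):

C:\> pip install numpy-1.9.3+mkl-cp35-none-win_amd64.whl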

As we kept making changes to the build process, Christoph kept updating his own build steps and testing with literally hundreds of packages. His feedback has directly led to these changes to Python 3.5 that will make it much easier for everyone to have solid, future-proof builds of Python and extension modules.


Ready for the gory technical details of this new issue? Here we go.

Previously, we were statically linking functions from vcruntime140.dll into each extension module built for Python 3.5. This included, among other things, the DLL initialisation routines.

On the positive side, each extension module is now completely isolated from any others with respect to initialization, locale, debugging handlers, and other state.

On the negative side, it turns out initialization is a limited resource.

The CRT has a number of features that are thread-safe, such as errno. Each thread has its own errno, which means you do not need to perform locking around every operation that may set or use it.

To implement this per-thread state, the CRT uses fiber-local storage. On Windows, fibers are a form of cooperative multithreading that work within threads (one process may contain many threads, one thread may contain many fibers), so the CRT uses fiber-local storage to ensure the finest-grain handling. If it used thread-local storage, different fibers would see the same values, and if it used a normal global, all threads would see the same value.

Fiber-local storage slots are allocated using FlsAlloc(). If you read that doc carefully, you’ll see that a potential return value (error) is FLS_OUT_OF_INDEXES, which means you have exhausted the current process’s supply of fiber-local storage.

The CRT doesn’t like it if you’ve run out of fiber-local storage, because that means a lot of its stateful functionality will be broken. So it aborts.

How many slots do we get? It isn’t documented (so it could change at any time), but here’s how we can test it:


>>> import ctypes
>>> FlsAlloc = ctypes.windll.kernel32.FlsAlloc
>>> list(iter(lambda: FlsAlloc(None), -1))
[7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127]
>>> len(_)
121

On my machine, I can call FlsAlloc 121 times after starting the interpreter before it runs out of indices.

When each extension module has its own isolated initialization routine, each one will call FlsAlloc. Which means once I’ve loaded 121 extension modules, the 122nd will fail.

That sucks.

Going back to /MD

The only reasonable fix for this is to switch back to building both Python and extensions using /MD. So we’ve done that.

That means that vcruntime140.dll is now a runtime dependency and must be included with Python 3.5. So we’ve done that.

That means that every extension module for Python 3.5 has to depend on vcruntime140.dll and therefore must always be built with MSVC 14.0.

Well, I don’t think so.

Initialization is no longer isolated. You can load as many extension modules as you like and they’ll all use the same fiber-local storage index because vcruntime140.dll is already initialized.

But what happens when someone comes along with MSVC 15.0 (note: not a real version number) and builds an extension that depends on vcruntime150.dll? When they publish a wheel of that extension and someone installs it on Python 3.5.0? That version of Python only includes vcruntime140.dll, so there will be an unresolved dependency and the extension will fail to load.

We’ve also made a change to fix this. The hint was in that last sentence: the extension has an unresolved dependency.

Extension Dependencies

It is no longer (and technically never was) Python’s responsibility to provide dependencies required by extension modules. If you release an extension module, you are responsible for including any dependencies, or instructing your users on how to get them.

But at the same time, we really don’t want to break python setup.py bdist_wheel upload (even though upload is insecure and you should be using twine or another uploader). If someone has MSVC 15.0 and builds a wheel with it, their users somehow need to get vcruntime150.dll onto their machines.

So we fixed distutils.

When you build an extension using distutils (which is how most extensions are built, even if you think you’re using another package), we know which compiler version is being used. This means we know which version of the CRT you are using and which version of vcruntime###.dll your extension depends on.

We also know which versions of vcruntime###.dll shipped with the target version of Python. Currently, this is a set that contains only vcruntime140.dll (it’s also an internal implementation detail, so don’t start depending on it), and for the lifetime of Python 3.5 it will only contain that value.

When you build an extension with MSVC, we find the redistributable vcruntime###.dll in your compiler install and if it is not in the set of included files, it is copied into your build.

Currently it’s hard to see this in action, but once we have a newer MSVC to play with you’ll be able to see it. Whenever you build an extension module, it will get the new vcruntime alongside it.
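
To illustrate, with a hypothetical future compiler the build output might look like this (the module name and DLL version are invented):

build\lib.win-amd64-3.5\
    spam.pyd
    vcruntime150.dll    (copied by distutils because Python 3.5 does not include it)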

Because of how DLL loading works, if that version of the DLL has not been loaded yet then the one adjacent to the extension module will be used. If it has been loaded, the one that is currently in memory will be used.

So really, we only need to include it once. The trick is making sure that it is loaded first. Ultimately, the only reliable way to do this is to include it everywhere.

Python 3.5 is always going to include vcruntime140.dll at this stage (though it may move from the top-level directory into DLLs at some point), so extensions built with MSVC 14.0 will always have it available.

Extensions built with a newer version of the compiler will include their own dependencies. As they should.

Perhaps Python 3.6 will be built with a newer version of the compiler. We can still include vcruntime140.dll, so that extensions can be built with MSVC 14.0, as well as including the newer version. We simply add the new DLL to the list in distutils, and builds will stop bundling their own copies.

In this way, we can ensure that, at least for simple extensions (the majority), the easiest path will remain compiler-version independent.

Overriding the build process

One of the neat things about making the fix in the build process is that there are ways to override it. We can also make changes and improvements in bugfix releases if needed. It’s also fairly easy for developers to patch their own system if they need to customize things further. However, a few simple customisations are available without having to touch any code.

There are three stages in the build process with respect to copying vcruntime (henceforth, “the DLL”).

The first stage is locating the redistributable DLL. For regular installs of MSVC, these are in a known location, so once we’ve found vcvarsall.bat we find the DLL relative to that path.

The second stage is updating build options. If we didn’t find the redistributable DLL, we will statically link it exactly as described previously. This is not recommended, but it is still allowed and supported (and there is an opt-in described below).

The third stage is copying the DLL to your build directory.

Here are a few tricks you can use to customize your build. These are current as of Python 3.5.0rc4, but could change.

Use a different DLL

To specify your own path to the redistributable DLL, use a complete VS environment (such as the developer command prompt), set the environment variable DISTUTILS_USE_SDK=1, and then set environment variable PY_VCRUNTIME_REDIST to the full path of the DLL to use.


C:\package> "C:\Program Files\Microsoft Visual Studio 14.0\VC\vcvarsall.bat"

C:\package> set DISTUTILS_USE_SDK=1

C:\package> set PY_VCRUNTIME_REDIST=C:\Path\To\My\vcruntime140.dll

C:\package> python setup.py bdist_wheel

Note that this doesn’t override any of the following steps, so if your version of Python already includes vcruntime140.dll, then it won’t be copied. However, there’s no reason you can’t specify a different name here, and then it will be copied (but then you have to deal with a differently named file, and might be missing actual dependencies… not really a great idea, but you can do it).

Always statically link

The build options are updated to statically link the DLL if PY_VCRUNTIME_REDIST is empty.

If you set DISTUTILS_USE_SDK but not PY_VCRUNTIME_REDIST, you will get statically linked vcruntime140.dll. This is probably the biggest surprise from the change.

Always dynamically link, but don’t copy

Probably the most useful customisation, you can choose to dynamically link the DLL but not copy it to your build output.

Since we check the name already, any file named vcruntime140.dll will never be copied anyway. However, with a future compiler you may already know that you are distributing the correct files and do not need distutils to help.

First, you apply the opposite of the previous customization: build options are updated to dynamically link the DLL if PY_VCRUNTIME_REDIST is not empty.

However, the DLL is only copied to the output directory if os.path.isfile(PY_VCRUNTIME_REDIST) returns True.

The tests are deliberately different. It means you can force dynamic linking without the copy by setting PY_VCRUNTIME_REDIST to anything other than a valid file:


C:\package> "C:\Program Files\Microsoft Visual Studio 14.0\VC\vcvarsall.bat"

C:\package> set DISTUTILS_USE_SDK=1

C:\package> set PY_VCRUNTIME_REDIST=No thanks

C:\package> python setup.py bdist_wheel

No customisation

And of course, if you don’t need to customise anything, you can just call setup.py directly and distutils will automatically use the latest available compiler.

Summary

It’s been a bumpy ride, especially over the last week, but we’re on track to release Python 3.5.0 in a good state.

Using MSVC 14.0 and the UCRT, we have independence from the compiler version.

When a new version of MSVC is available, extensions built for Python versions that may not include the default dependencies will bundle them in their own package. (There are still some uninstallation concerns here to resolve, but we have time for those.)

In 10 years’ time it should still be possible to build extensions that work with Python 3.5.0 using the latest tools available at that time. Everyone who builds for Python 2.7 today should be excited by that prospect.

Thank you again to everyone who has contributed testing, feedback, fixes and suggestions. We hope you will love Python 3.5.

Building extensions for Python 3.5

Some parts of this post have been superseded. If you are interested in the background, continue reading (I have marked the parts that are now incorrect). If you simply want to see how Python 3.5 and later are avoiding CRT compatibility issues, you can jump straight to part two.


As you read through the changelog or What’s New for Python 3.5, you may notice the following item:

Windows builds now use Microsoft Visual C++ 14.0, and extension modules should use the same.

For most Python users, this will not (and should not) mean anything. It doesn’t affect how you use or interact with Python, it isn’t new syntax or a new library, and generally you won’t notice any difference other than a small performance improvement (yay!).

However, Python extenders and embedders care a lot about this change, because it directly affects how they build their code. Since you are almost certainly using their work (numpy, anyone? Blender? MATLAB?) you should really hope that they care. However, this is a change, and no matter how good the end result is, change takes time. Please be patient with project maintainers, as they will have to spend more time supporting Python 3.5 than previous versions.

So while the most obvious benefit for most people may be a performance improvement, we haven’t even bothered benchmarking this precisely. Why not? Because the long-term benefits of the change are so good it would be worth sacrificing performance to get them. And we know it’s going to hurt some project maintainers in the short term, and again, the long-term benefits for the entire ecosystem – and those same maintainers – are worth it.

As I’m largely responsible for the compiler change (with the full support of the CPython developers, of course), this post is my attempt to help our ecosystem catch up and set the context so everyone can see just how the benefits are worth the pain. Python 3.5.0rc2 is already available with all of these changes, so project maintainers can be testing their builds now.

First, some definitions

While this post is intended for advanced audiences who probably know all of these terms, I’ll set out some definitions along the way just to make sure we’re all talking about the same thing.

MSVC is Microsoft Visual C++, the compiler used to build CPython on Windows. It’s often specified with a version number, and in this post I’ll refer to MSVC 9.0, MSVC 10.0 and MSVC 14.0.

CRT refers to the C Runtime library, which for MSVC is provided and supported by Microsoft and contains all of the standard functions your C programs can call. This is a heavily overloaded term, so I’ll be more specific and refer to DLL names (like msvcr90.dll) or import library names (like libucrt.lib) where it matters.

MSVCRT refers specifically to the CRT required by MSVC. Other compilers like gcc have their own CRT, typically known as libc, and even MSVCRT is made up of parts with their own distinct names.

VCRedist will come up later, and it refers to the redistributable package provided by Microsoft that installs the CRT, as well as extra files required for C++ programs and Microsoft extensions such as C++ AMP and the Concurrency Runtime.

MSVCRT’s Little Problem

The problem is rooted in a design decision made many years ago as part of MSVC (I don’t even know which version – probably the very earliest). While we can view it differently today, at the time it was clearly a good design. However, the long-term ramifications were not obvious without the rise of the internet.

Each version of MSVC (the compiler) comes with a matched version of the CRT (the library), and the compiler has intimate knowledge of the library. This allows for some cool optimizations, like choosing different implementations of (say) memcpy automatically based on what the compiler knows about the variables involved – if it can prove the ranges never overlap, why bother checking for overlap at runtime?

However, it does mean that when you use a different compiler, you also have to use the matched CRT version or everything breaks down very quickly. Generally this is okay, since when a developer upgrades to a newer compiler they can rebuild all of their code. The reason the internet causes this to break down is the rise of plugins and the ease of updates.

Many applications support plugins that are loadable shared libraries (DLLs, often with a special extension such as .pyd). While the application may not consider or describe these as plugins – Python prefers “extensions” or “native modules” – it is still a plugin architecture. And with the internet, we have easier access than ever to download and install many such plugins, and also to update the host application.

The CRT comes into play because it is shared between the host application and every plugin. Or rather, it assumes that it is shared. Because of the way Windows loads DLLs, if the host application and all its plugins are built with the same MSVC version and hence use the same CRT version, the state kept within that CRT would be shared.

Shared state includes things such as file descriptors, standard input/output buffering, locale, memory allocators and more. These features can be used equally by the host and its plugins without conflicts resulting in data corruption or crashes.

However, when a plugin is built with a different CRT, this state is no longer shared. File descriptors opened by the plugin do not exist (or worse, refer to a different file) in the host, file and console buffering gets confused, error handling is no longer synchronised, memory allocated in one cannot be freed in the other and so on. It is possible to safely use a plugin built with a different CRT, but it takes care. A lot of care.

This is the situation that Python 2.7 currently suffers from, and will continue to suffer from until it is completely retired. Python 2.7 is built with MSVC 9.0, and because of compatibility requirements, will always be built with MSVC 9.0 – otherwise a minor upgrade would break all of your extensions simultaneously, including the ones that nobody is able to build anymore.

Unfortunately, MSVC 9.0 is no longer supported by Microsoft and all the free downloads were removed, making it essentially impossible to build extensions for Python 2.7. The easiest mitigation was to keep making the compilers available in an unsupported manner, so we did that, but it still leaves projects in a place where they are using old tools, likely with unpatched bugs and vulnerabilities. Not ideal.

Python 3.3 and 3.4 were built with MSVC 10.0, which is in essentially the same position. The compiler is no longer supported and the tools are no longer easily available. Building extensions with later versions of MSVC results in CRT conflicts, and building with the older tools misses out on security fixes and other improvements.

One example of an improvement in MSVC 14 that is not in MSVC 10 or earlier is support for the C99 standard. I’m not claiming it’s 100% supported (it’s not), but even 90% support is much more useful than what was previously available.

The best mitigation we have for MSVC 10.0 builds of Python is to migrate to Python 3.5. Luckily, doing so does not require the same porting effort as moving from Python 2.7, but it raises the question: why is Python 3.5 any better?

The answer is: UCRT.

The UCRT Solution

As part of Visual Studio 2015, MSVCRT was significantly refactored. Rather than being a single msvcr140.dll file, as would be expected based on previous versions, it is now separated into a few separate DLLs.

The most exciting one of these is ucrtbase.dll. Look carefully – there is no version number in the filename! This DLL contains the bulk of the C Runtime and is not tied to a particular compiler version, so plugins that reference ucrtbase.dll will share all the state we discussed above, even if they were built with different compilers.

Another great benefit is that ucrtbase.dll is an operating system component, installed by default on Windows 10 and coming as a recommended update for earlier versions of Windows. This means that soon every Windows PC will include the CRT and we will not need to distribute it ourselves (though the Python 3.5 installer will install the update if necessary).

It’s very important to clarify here that the compatibility guarantees only hold when linked through ucrt.lib. The public exports of ucrtbase.dll may change at any time, but linking through ucrt.lib uses API Sets to provide cross-version compatibility. Using the exports of ucrtbase.dll directly is not supported.

So the major issue faced by earlier versions of Python no longer exists. The next version of MSVC will be able to build extensions for Python 3.5, and it may even be possible for later versions of Python 3.5 to be built with newer compilers without affecting users. But while this is the start of the story, it isn’t the end, and the rest is not so pretty.

The UCRT Problems

While ucrt.lib is a great improvement over earlier versions, if you followed the link above or just read my comment carefully, you’ll see the rest of the problem. Besides ucrtbase.dll, there are other libraries we need to link with.

For pure C applications, the other DLL we need is vcruntime140.dll. Notice how this one includes a version number? Yeah, it depends on the version of the compiler that was used. Applications using C++ will likely depend on msvcp140.dll, which is also versioned. We have not yet completely escaped DLL hell.

Why weren’t these libraries also made version independent? Unfortunately, there are places where the compiler still needs intimate knowledge of the CRT. They are very few, and vcruntime140.dll in particular exports almost no functions that are both documented and have no preferred alternative in ucrtbase.dll (for example, memcpy may be used from vcruntime140.dll, but memcpy_s from ucrtbase.dll should be preferred). However, much of the critical startup code is part of vcruntime140.dll, and this is so closely tied to what the compiler generates that it cannot reasonably be made compatible across versions.

Ultimately, depending on any version-specific DLL takes us right back to the earlier issues. Extensions for Python 3.5 will need to use MSVC 14.0 or else include the version-specific DLLs – Python 3.5 could include vcruntime140.dll, but if an extension depends on vcruntime150.dll then it is not easily distributable.

Luckily, this concern was raised as the UCRT was being developed, and so there is a semi-official solution for this that happens to work well for Python’s needs.


The End (of part one)

Remember how I said at the start that some of this blog is no longer valid for Python 3.5? Yeah, that’s from here to the end. To see what we’ve actually done, stop reading here and read part two instead.


The Partially-Static Solution

To avoid having a runtime dependency on vcruntime140.dll, it is possible to statically link just that part of the CRT. Effectively, the required functions, which tend to be a very small subset of the complete DLL, are compiled into the final binary. However, the functions from ucrtbase.dll are still loaded from the DLL on the user’s machine, so many of the issues associated with static-linking are avoided.

There are many downsides to static linking, especially of the core runtime, ranging from larger binaries through to not automatically receiving security updates from the operating system. Previously, applications including Python have avoided static linking by distributing the CRT as part of the application (“app local”), but while this avoids some of the bloat concerns, the application distributor is still responsible for providing updates to the CRT. Statically linking vcruntime140.dll also leaves responsibility with the distributor for some updates, but significantly fewer.

Warning: This is where things get technical. Skip to the next section if you just want to know what you’ll need to fix.

The difference between dynamic linking and static linking is based on a few options passed to both the compiler (cl.exe) and the linker (link.exe). Most people are familiar with the compiler option, one of /MD (dynamic link), /MDd (debug dynamic link), /MT (static link) and /MTd (debug static link). As well as automatically filling out the remaining settings, these also control some code generation at compile time – different code needs to be compiled for static linking versus dynamic linking, and this is how that option is selected at compile time.

For the linker, there are separate libraries to link with. If the compiler option is provided, these are selected automatically, but can be overridden with the /nodefaultlib option. This table is adapted from the VC Blog post I linked above:

Release DLLs   (/MD):
    msvcrt.lib   vcruntime.lib      ucrt.lib
Debug DLLs     (/MDd):
    msvcrtd.lib  vcruntimed.lib     ucrtd.lib
Release Static (/MT):
    libcmt.lib   libvcruntime.lib   libucrt.lib
Debug Static   (/MTd):
    libcmtd.lib  libvcruntimed.lib  libucrtd.lib

I will ignore the debug options for the rest of this post, as debug builds should generally not be redistributed and can therefore reliably assume all the DLLs they need are available. This is why the Python 3.5 debug binaries option requires Visual Studio 2015 – to make sure you have the debug DLLs.

For a fully dynamic release build, we build with /MD. This enables codepaths in the CRT header files that decorate CRT functions with __declspec(dllimport), so code is generated for calls to go through an import stub. Linking in vcruntime.lib and ucrt.lib provides the stubs that will be corrected at load time to refer to the actual DLLs.

For a fully static build, we use /MT, which omits the __declspec’s and generates normal extern definitions. Linking with libvcruntime.lib and libucrt.lib provides the actual function implementations, and the linker resolves the symbols normally, just as if you were calling your own function in a separate .c file.

What we want to achieve is linking with libvcruntime.lib for the static definitions, but ucrt.lib for the import stubs. Unfortunately, the compiler does not know how to generate code for this case, so it will either assume import stubs for all functions, or none of them, which results in linker errors later on.

There is one case that works: if we compile with /MT so the CRT will be statically linked, the generated code assumes everything can be resolved through its regular name. When linking, if we then substitute ucrt.lib for libucrt.lib, the linker can generate the import stubs needed to call into the DLL.

The build commands look like this:

cl.exe /MT /GL file.c
link.exe /LTCG /NODEFAULTLIB:libucrt.lib ucrt.lib file.obj

We use /MT to select the static CRT. The /GL and /LTCG options enable link-time code generation, and the /NODEFAULTLIB:libucrt.lib ucrt.lib arguments ignore the static library and replace it with the import library. The linker then generates the code needed for this to work, and we end up with a DLL or an executable that only depends on ucrtbase.dll (via the API sets).

Unfortunately, there are some follow-on effects because of this change.

What else does this break?

In case you skipped the last warning, the rest of this post is now invalid. To see what we’ve actually done, stop reading here and read part two instead.


With Python 3.5, distutils has been updated to build extensions in a portable manner by default. Most simple extensions will build fine, and your wheels can be distributed to and will work on any machine with Python 3.5 installed. However, in some cases, your extension may fail to build, may produce a significantly different .pyd file from previously, or may need extra dependencies when distributed.

Static Libraries

The first likely problem is linking static libraries. Because of the compiler change, you will probably need to rebuild other static libraries anyway, and it is important that when you do you select the static CRT option (/MT). As discussed above, we don’t actually link the entire CRT statically, but if your library expects to dynamically load the CRT DLL then it will fail to link.

If your library requires C++, your resulting .pyd will statically link any parts of the C++ runtime, and so it may be significantly larger than the same extension for Python 3.4. This is unfortunate, but not a critical issue, and it actually has the benefit that your extension will not be interfered with by other extensions that also use C++.

Of course, in some cases you really do not want to do this. In that case, I would strongly discourage you from uploading your wheels to PyPI, since you will also need to get your users to install the VCRedist version that matches the compiler you used. Currently, there is no way to check or enforce this through tools like pip.

Since it is so strongly discouraged, I’m not even going to show you how to do it, though I’ll give basic directions. In your setup.py file, you’ll want to monkeypatch distutils._msvccompiler.MSVCCompiler.initialize() (yes, the underscore in _msvccompiler means this is not supported and we may break it at any point), call the original implementation and then replace the '/MT' element in self.compile_options with '/MD'.

Ugly? Yep. By going down this path, you are making it near impossible for non-administrative users to use your extension. Please be very certain your users will be okay with this.

Dynamic Libraries

If you have binary dependencies that you can’t recompile but have to include, then your best option is to include the redistributable DLLs alongside them. Test thoroughly for CRT incompatibilities, especially if your dependencies use a different version of MSVCRT, and generally assume that only ucrtbase.dll will be available on your users’ machines. Dependency Walker is an amazing tool for checking binary dependencies.
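
If you prefer the command line, the MSVC toolchain’s dumpbin can also list the DLLs a binary links against (the module name here is illustrative):

C:\package> dumpbin /dependents mymodule.pyd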

Incompatible Code

The third likely issue that will be faced is code that no longer compiles. There has been an entire deprecation cycle between MSVC 10.0 and MSVC 14.0, which means some functions may simply disappear without warning (because the warning was in MSVC 11.0 and MSVC 12.0, which Python never used). There have also been changes to unsupported names and a number of non-standard names are now indicated correctly with a leading underscore.

Also, as with every release, the graph of header files may have changed, and so names that were implicitly #included previously may now require the correct header file to be specified. (This is not necessarily names moving into different header files, rather, one header file may have included another and the name was available that way. Dependencies within header files are not guaranteed stable – you should always include all headers directly when you require their definitions.)

A lot of code tries to fill gaps in various compilers and runtimes by defining functions under #ifdef directives. With the range of changes that have occurred, most of these should be checked and updated – _MSC_VER is defined as at least 1900 now, and because of the switch to /MT some declarations of CRT exports may need to have the __declspec(dllimport) removed (or remove the entire declaration and use the official headers).

MinGW

Finally, extensions that are built with gcc under MinGW are likely to have compatibility issues for some time yet, since the UCRT is not a supported target for those tools. Again, this pain is unfortunate, but long term it should be entirely feasible for the MinGW toolchain to support the Universal CRT better than the MSVC 10.0 CRT.

In Summary

If you made it this far, technically this part is still mostly correct. But then, it doesn’t really add anything new. To see what we’ve actually done, read part two.


By moving Python 3.5 to use MSVC 14.0 and the Universal CRT, we have (hopefully) removed the restriction to build extensions with matched compilers.

Extensions built with distutils can be distributed easily, though they may be larger or have more build errors as a result of different build settings.

Long term, we believe this change will avoid the problems currently faced by those building for Python 2.7 as toolchains are deprecated and retired.

The short-term pain we are going to experience would have occurred for any compiler change, but after this we should be largely insulated against the next.

Finally, please show some respect and grace towards the maintainers of projects you depend upon. It may take some time to see fully compatible releases for Python 3.5, and shouting at or abusing people online is not necessary or even helpful.

I personally want to thank everyone who distributes builds of their packages for Windows, which they don’t strictly need to do, and I apologise for the pain of transition. This change is meant to help you all, not to hurt you, though the benefits won’t be seen for some time. Thank you for your work, and for making the Python ecosystem exist for millions of users.

What’s coming for the Python 3.5 installer?

Last year at PyCon US, I volunteered to take over maintenance and development of the Python installers for Windows. My stated plan was to keep building the installer for Python 2.7 without modification, and to develop a new installer for Python 3.5. In this post, I’m going to show some of the changes I’ve been working on. Nothing is settled yet, and more changes are basically guaranteed before the first releases occur, but I’m happy that we’ll soon have a more powerful and flexible installer.

The installer will first be available for Python 3.5.0 alpha 1, due to be released in February.

Changes You Will Notice

The most dramatic change (and the most likely to be removed before the final release) is new default installation locations.

[Screenshot: The first page of the Python 3.5 installer, showing "Install for All Users", "Install Just for Me", and "Customize installation" buttons.]

Installing a copy of Python for all users on a machine and allowing everyone to modify it (the default under Python 3.4 and earlier) is a massive security hole. When installed into Program Files, only administrators can modify the shared files, and so users are better protected from malicious or accidental modifications.

Those who have used the Just for Me option in previous versions of Python are likely to have been surprised when it did not work as expected. For Python 3.5, this is now a true per-user installation. All files are installed into a directory that can only be accessed by the current user, and the installation works without administrative privileges.

The first two buttons on this page are single-click installs, meaning you’ll get all the default features and options, including pip and IDLE. For most users, these will dramatically simplify the process of installing Python.

However, many of us (myself included) like to be a bit more selective when we install Python. The third button, Customize installation, is for us.

[Screenshot: The Optional Features page of the Python 3.5 installer, showing "pip", "tcl/tk and IDLE", and "Python test suite" checkboxes.]

There are two pages of options. The first is a list of features that can be added or removed independently of the rest of the installation. Compared to the old-style tree view, the simple list of checkboxes makes it easier to see what each feature provides. This is also the screen you’ll see when you choose to modify an existing installation.

[Screenshot: The Advanced Options page of the Python 3.5 installer.]

The second page is advanced options, including the install location which (currently) defaults to the legacy directory, allowing you to install Python 3.5 identically to the older versions with the same amount of clicking. Right now, the options are basically identical to previous versions, but they are no longer mixed up with installable features. The way they are implemented has also been improved to be more reliable.

[Screenshot: The success page of the Python 3.5 installer, showing a thank-you to Mark Hammond and links to the online documentation.]

From here, the rest of the installation proceeds as you’d expect. The final page retains the familiar message (thanks, Mark!) and also adds some links into the online documentation.

Changes You Will Not Notice

One interesting option you may have spotted on the Advanced Options page is a checkbox to install debugging symbols (.pdb files). These are handy if you work on or debug C extensions (for example, Python Tools for Visual Studio‘s mixed-mode C/Python debugging feature requires Python’s PDB files), and this is an easy way to install them. Previously the symbol files were available in a separate ZIP, but now they are just a checkbox away.

But wait, doesn’t this make the installer a larger download? Yes, or at least it would if the installer included the debugging symbols.

The biggest change to the installer is its architecture. Previously, it was a single MSI with a single embedded CAB that contained all the files. The new installer is a collection of MSIs (currently 12, though this could change), CABs (currently 16) and a single EXE. The EXE is the setup program shown above, while the CABs contain the install files and the MSIs have the install logic.

With this change, it is possible to lazily download MSIs and CABs as needed. Although it’s not marked in the screenshot above, the “Install debugging symbols” option will require an active internet connection and will download symbols on demand. In fact, it’s trivially easy to download all the components on demand, which reduces the initial download to less than 1MB.

My initial plan is to release four downloadable installers for Python 3.5.0 alpha 1: two “web” installers (32-bit and 64-bit) and two “offline” installers that include the default components (the download size is around 20MB, and it includes everything that was included in earlier versions). Depending on feedback and usage, this may change by the final release, but initially I want to offer useful alternatives without being too disruptive.

Another change that is part of the build process is code signing. Previously, only the installer was signed, which meant that undetectable changes could be made to python.exe or pythonXY.dll after installation. As part of reworking the installer, I’ve also moved to signing every binary that is part of a Python installation. This improves the level of trust for those who choose to validate signatures, as well as using the signed UAC dialog rather than the unsigned one when running Python as an administrator.

Changes For Administrators

For those who have scripted or automated Python installation from the old MSIs, things are going to change a bit. I believe these changes are for the better, as we never really documented or supported command-line installation before, and I’ll be interested in the feedback from early adopters.

The first concern likely to arise is the web installers – how do I avoid downloading from the Python servers every time I install? What if I have to install on two hundred machines? Two thousand? The easiest way is to simply download everything once with the “/layout” option:

python-3.5.0a1.exe /layout \\shared\python\3.5.0a1

This will not install Python, but it will create a folder on a shared machine (or a local path) and download all the components into that folder. You can then run `python-3.5.0a1.exe` from that location and it will not need to download anything. Currently the entire layout is around 26MB for each of the 32-bit and 64-bit versions.

The files that make up the Python 3.5 installer layout

To silently install, you can run the executable with `/quiet` or `/passive`, and customisation options can be provided as properties on the command-line:

python-3.5.0a1.exe /quiet TargetDir=C:\Python35 InstallAllUsers=1 Include_pip=0 AssociateFiles=0 CompileAll=1
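
Combining these, a silent per-user install that avoids the administrative prompt entirely might look like the following (assuming the property names above keep their current meanings; the target path here is purely illustrative):

python-3.5.0a1.exe /quiet InstallAllUsers=0 TargetDir=%LocalAppData%\Python35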

I’m not going to document the full list yet, as they may change up until the final release, but there will be a documentation page dedicated to installing and configuring Python on Windows.

How Can I Try This Out Early?

I’m still very actively working on this, but you can get my changes from hg.python.org/sandbox/steve.dower on the Installer branch. The build files are in Tools/msi and will (should) work with either Visual Studio 2013 or Visual Studio 2015 Preview.

Where Do I Complain About This?

I am keen to hear constructive feedback and concerns, so come and find the threads at python-dev. Nothing is unchangeable, and the Python community gets to have its say, though right now I’m looking to stabilise things up until alpha so please don’t be too upset if your suggestion doesn’t appear in the first release.

If you’re at all angry or upset, please make sure you’ve read the entire post before sharing that anger with everyone else. (That’s just general good advice.)

New New Project from Existing Code

Last time I wrote about the New Project from Existing Code wizard in Visual Studio and how we extended it to provide support for Python. This time, I’m going to look at the replacement for this dialog.

As discussed, there are a number of issues with reusing the managed languages wizard (C# and VB) for Python, largely due to the fact that it was never intended to be extensible. It is also difficult or impossible to (reliably) provide an alternate implementation, and in any case the feature is not very easy to discover. Because the usual way to create a new project involves skipping the menu and going straight to the list of templates, the aim was to add an item to this dialog:

New Project dialog with From Existing Python code selected

The three steps involved are creating the wizard, providing the necessary interfaces for Visual Studio and creating a template to start the wizard. The code is from changeset a8d12570c484, which was added prior to PTVS 1.5RC.

Designing a Wizard

Since this approach does not rely on reusing existing code, we were free to design the wizard in a way that flows nicely for Python developers and only exposes those options that are relevant, such as the interpreter to use with a project and the main script file. Because not every option needs to be used, and there are obviously ways to change them later, the wizard was set up with two pages. The first page collects the information that is essential to importing a project: the source of the files, a filter for file types and any non-standard search paths.

New Project from Existing Python Code Wizard page one

To avoid forcing the user to guess what each value means or how it behaves, lighter explanatory text is included directly in the window. (In earlier days these would have been tooltips, but with touch starting to become more prominent, requiring hovering is poor UI design.) The tone is deliberately casual and reassuring – one of the surprises people often find with Visual Studio is that adding an existing file will copy it into the project folder. When importing very large projects, it is far more desirable to leave the files alone and put the project in a nearby location. Because we support a ProjectHome element in our .pyproj files, we can treat any folder as the root of the project (this will no doubt be the subject of a post in the future).

The second page is entirely optional, and while it cannot be skipped outright from the first page, once users are familiar with the dialog it is very easy to click through without changing the default selections:

New Project from Existing Python Code Wizard page two

The two options on this page relate to Visual Studio settings, specifically, which version of Python should be used when running from within VS and which file should be used for F5/Ctrl+F5 execution (as opposed to using the “Start with/without Debugging” option on a specific file). Again, the light grey explanatory text reassures the user that any selection made here is not permanent and provides directions on how to make changes later. The second option – which file to run on F5 – also suggests that not all files will appear in the list. For performance reasons, only *.py (and *.pyw) files in the root directory are shown, since showing all files would require a recursive directory traversal (which is slow) and produce a much longer list of files (hard to navigate). Since Python does not allow the import statement to traverse up from the started script (in typical uses), most projects will have their main file in the root of the project. (That said, there is a chance that this aspect of the dialog will change for the next release, either by including all files, switching to a treeview or simply not being offered as an option.)
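
As a rough illustration of that filter, the set of candidate startup files could be computed with something like this (a Python sketch only; the actual implementation lives in the C# wizard code):

import os

def candidate_startup_files(project_root):
    # Only *.py and *.pyw files directly in the project root are offered;
    # a recursive walk over a large project would be slow and produce an
    # unmanageably long list.
    return sorted(
        name for name in os.listdir(project_root)
        if name.endswith(('.py', '.pyw')) and
           os.path.isfile(os.path.join(project_root, name))
    )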

When “Finish” is clicked, the rest of the task is quite straightforward: the files in the provided path are scanned for all those matching the filter and the $variables$ in FromExistingCode.pyproj are replaced. This produces a .pyproj file that is then loaded normally. (Contrast with the other approach that creates an empty project and adds each file individually. This way is much faster.) Details are in the following section.

IWizard and replacementsDictionary

Template wizards in Visual Studio are created by implementing the IWizard interface. The methods on this interface are called at various points to allow customisation of the template, but only one method is of interest here: RunStarted. The how-to page covers the process in detail, but the basic idea is that RunStarted displays the user interface and updates the set of replacement strings, which are then applied to existing template files.

The only template file used is FromExistingCode.pyproj, which contains five variables for replacement: $projecthome$, $startupfile$, $searchpaths$, $interpreter$ and $content$. While the first three are simple values, $interpreter$ will be replaced by the GUID and version (InterpreterId and InterpreterVersion properties) that represent the interpreter selected on page two of the wizard, and $content$ is replaced by the list of files and folders. Strictly, this is a slight misuse of the templating system, which is intended for values rather than code/XML generation, but it works and is quite efficient.

When RunStarted is called (by VS), a dictionary is provided for the wizard to fill in with these values. This means that a lot of processing takes place within this one function, which is generally not how callbacks like this should behave. However, in this case, it is perfectly appropriate to use modal dialogs and let exceptions propagate – in particular, WizardBackoutException should be thrown if the user cancels out of the dialog (unlike the WizardCancelledException, backing out returns the user to the template selection dialog).

RunStarted is implemented (along with the other methods) in ImportWizard.cs, with UI and XML generation separated into other methods to allow for easier testing.

public void RunStarted(object automationObject, Dictionary<string, string> replacementsDictionary, WizardRunKind runKind, object[] customParams) {
    try {
        var provider = new ServiceProvider((Microsoft.VisualStudio.OLE.Interop.IServiceProvider)automationObject);
        var settings = ImportWizardDialog.ShowImportDialog(provider);
 
        if (settings == null) {
            throw new WizardBackoutException();
        }
 
        SetReplacements(settings, replacementsDictionary);
    } catch (WizardBackoutException) {
        try {
            Directory.Delete(replacementsDictionary["$destinationdirectory$"]);
        } catch {
            // If it fails (doesn't exist/contains files/read-only), let the directory stay.
        }
        throw;
    } catch (Exception ex) {
        MessageBox.Show(string.Format("Error occurred running wizard:\n\n{0}", ex));
        throw new WizardCancelledException("Internal error", ex);
    }
}

One point worth expanding on is the Directory.Delete call in the cancellation handler. Because this is a new project wizard, VS creates the destination directory based on user input before the wizard starts. However, if the wizard is cancelled from within RunStarted, as opposed to failing before RunStarted can be called, the directory is not removed. To prevent the user from seeing empty directories appear in their Projects folder, we try to remove it. (That said, we don’t try very hard – if the directory has ended up with files in it already then it will not be removed.)

The .vstemplate File

The final piece of this feature is adding the entry to the New Project dialog and starting the wizard when selected by the user. Templates in Visual Studio are typically added by including .vstemplate files in a registered folder (optionally registering a folder specific to an extension). These files are XML and specify the template properties and the list of files to copy to the destination directory. For templates that include wizards, an optional WizardExtension element can be added, as seen in FromExistingCode.vstemplate:

<WizardExtension>
  <Assembly>Microsoft.PythonTools.ImportWizard, Version=1.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a</Assembly>
  <FullClassName>Microsoft.PythonTools.ImportWizard.Wizard</FullClassName>
</WizardExtension>

Importantly, the assembly name must be a fully qualified name, which is why we used a fixed version number for this file (the rest of our assemblies use an automatically generated build number, which is fine for compiled code but not so simple to include in a data file that is embedded in a .zip file). Wizard is the name of the class in ImportWizard.cs that implements IWizard, and so VS is able to instantiate Wizard and invoke RunStarted as part of creating the new project. The entire template is simply added to our existing templates directory, and VS discovers the .vstemplate file and includes it in the list.

One of the concerns with the previous version of this wizard was discoverability: it was not easy to find the feature. To completely solve this problem, we set the SortOrder value of the template to 10, which is very low and all but guarantees that it will always appear first in the list. So now the first option that will appear to both new and existing users of PTVS is a simple way to use their projects without having to add each file individually.

Summary

In this post we looked at the new New Project from Existing Code wizard that replaced the one from last time. Not using the existing implementation allowed us to design the wizard specifically for Python code and implement it efficiently. We also made the feature more discoverable by placing it first in the list of templates, which is how new projects are usually created. This feature was first released in Python Tools for Visual Studio 1.5RC.

New Project from Existing Code

In this post, we’ll be looking at how version one of a particular feature was implemented. The implementation was not very good (I can say that, since I contributed it), but it filled a necessary gap until it could be done better. Python Tools for Visual Studio 1.1 had this implementation, while PTVS 1.5 uses a completely different approach (that might be the subject of a future blog, but not this one).

When creating a new project in Visual Studio, you typically begin from a list of templates that are specific to your language:

Visual Basic project templates

If you already have a set of existing code, the obvious approach is to select “Empty Project” and add the files to that project. This situation is common when working on cross-platform projects, which often do not include Visual Studio projects, or when you want to migrate from earlier versions without introducing the compatibility ‘hacks’ provided by the upgrade wizard. An alternative to manually importing the files into an empty project is to use the “Project from Existing Code” wizard, which is available under the “New Project” submenu:

Project from Existing Code menu item

By default, this wizard only supports C#, VB and C++, but since very few Python projects ever come with project files, importing existing code is a very common task. This feature added basic support for Python projects to the existing wizard. The code is from changeset 156ceb1b1949, which was part of the original pull request that was merged into PTVS 1.0.

Adding Python to the list

The default wizard only provides support for C#, VB and C++.

Import Wizard window

Luckily, since Visual Studio is endlessly extensible, these project types are not hardcoded, but are found in the registry. For example, the C# entries are found in the Projects\{FAE04EC0-301F-11d3-BF4B-00C04F79EFBC}\ImportTemplates subkey of the Visual Studio key:

C# ImportTemplates registry entry

The meaning of each value is relatively straightforward, but to summarise:

WizardPageObjectAssembly
Full name of the assembly containing the wizard implementation.
WizardPageObjectClass
Full name of the class implementing the wizard.
ProjectType
The name (or a string resource within the project type’s package) to display in the dropdown box.
ImportProjectsDir
Directory containing the following three files.
ClassLibProjectFile
Filename of the project template for a class library project.
ConsoleAppProjectFile
Filename of the project template for a console application project.
WindowsAppProjectFile
Filename of the project template for a windowed application project.

Those last four items are specific to the Microsoft.VsWizards.ImportProjectFolderWizard.Managed.PageManager class, which is used for C# and VB projects only. The C++ wizard uses a different implementation that is very specific to C++, but the one for C# and VB is far more general and can be easily adapted to Python without requiring any new code (except to set up the registry entries). We could also provide a completely new class, but in the interest of minimising new code, that wasn’t done this time around.

To register the existing importer for Python projects, a ProvideImportTemplatesAttribute was created, since package attributes are how registry keys are added for PTVS (rather than specifying them in the installer). It was applied to the project package as:

[ProvideImportTemplates("#111",
                        PythonConstants.ProjectFactoryGuid,
                        "$PackageFolder$\\Templates\\Projects\\ImportProject",
                        "ImportWinApp.pyproj",
                        "ImportConApp.pyproj",
                        "ImportConApp.pyproj")]

The parameters map very closely to the values shown above for C#. PTVS has no concept of a library project, so we use a standard console project: the wizard implementation is… fragile… and while you could hope that leaving the class library template out would disable the option, in reality it simply causes an exception. Each template was created simply by using the existing template with the initial Program.py removed.

Adding this attribute is sufficient to make “Python Project” (the string referenced by “#111”) appear in the wizard, but not to actually import the project. This second step required a little more work inside the implementation of the PTVS project system.

The AddFromDirectory method

Among all the classes required to implement the project system, it is very rare to see dead code. Methods are only implemented if they are called, and as it turned out, a method required for importing existing code had never been written. This was the AddFromDirectory method in OAProjectItems.cs.

In brief (not that I can be much more brief than that MSDN page), AddFromDirectory recursively adds a folder and all the files and folders it contains into the project. The directory provided is returned as a new project folder (as if the user had clicked on “New Folder” themselves), which means that this function cannot actually be used to import the entire project. As a workaround, the wizard implementation adds any top-level files directly, then calls AddFromDirectory for each top-level folder.

Implementing AddFromDirectory is very simple, since the existing AddDirectory and AddFromFile methods can be used. The original implementation filtered files to only include those that could be imported from Python (in effect, *.py and *.pyw files in directories containing an __init__.py file), though this was later relaxed based on user feedback.

Problems

While this approach is very simple, it has a number of drawbacks. Those that have already been mentioned include an irrelevant option for a “Class Library” and the inability to filter top-level files and directories. These could be easily resolved by substituting another wizard class, but this is actually the point where the extensibility breaks down.

Import wizards are .NET classes that implement Microsoft.VsWizards.ImportProjectFolderWizard.IPageManager from the Microsoft.VisualStudio.ImportProjectFolderWizard.dll assembly that ships with Visual Studio. Unfortunately, this assembly is not stored anywhere that can be safely referenced without assuming the full VS installation path, cannot be redistributed (and probably cannot be loaded multiple times from different locations anyway), and is not guaranteed to be loaded before another assembly containing wizards. Basically, while it is possible to create a new wizard, it is not possible to safely load that wizard, making this idea pretty useless.

The final problem that was encountered was discoverability. Despite documentation, a video and announcements, it is not easy to find this feature, and even the other languages suffer from this. It would seem that people generally use other ways to open the New Project window (since there are at least four) and so never even see the “From Existing Code” option. In the 1.5 release of PTVS we’ve tried to solve this by providing the wizard as an item in the normal templates list:

Python project templates with PTVS 1.5

Summary

It is possible to extend the “New Project from Existing Code” wizard by providing some template files and registry entries. The project system that the wizard is for has to implement AddFromDirectory on project items, since this is the function that performs most of the work. Unfortunately, extending by providing a new wizard implementation is not possible because of how the dependent assemblies are distributed and loaded. This feature was originally in Python Tools for Visual Studio version 1.0, and was replaced in version 1.5 with a more appropriate and discoverable wizard.

Async/await in Python

Those who follow the development of C# and VB probably know about the recently added Async and Await keywords, which turn code like this:

int AccessTheWeb() {
    int result = 0;
    Thread thread = new Thread(() => {
        HttpClient client = new HttpClient();
        // block synchronously until the download completes
        string urlContents = client.GetStringAsync("http://msdn.microsoft.com").Result;
        result = urlContents.Length;
    });
    thread.Start();
    DoIndependentWork();
    thread.Join();
    return result;
}

Into code like this:

async Task<int> AccessTheWebAsync()
{ 
    HttpClient client = new HttpClient();
    Task<string> getStringTask = client.GetStringAsync("http://msdn.microsoft.com");
    DoIndependentWork();
    string urlContents = await getStringTask;
    return urlContents.Length;
}
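
Python had no comparable syntax at the time, but to give a flavour of the generator-based style being discussed, the same structure might be sketched like this (all helper names here are hypothetical, using Python 3.3’s yield from):

def access_the_web_async():
    # get_string_async and do_independent_work are hypothetical stand-ins
    # for the C# calls above; awaiting is spelled "yield from".
    get_string_task = get_string_async("http://msdn.microsoft.com")
    do_independent_work()
    url_contents = yield from get_string_task
    return len(url_contents)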

For the last week or two, an intense discussion has been occurring on the python-ideas mailing list about asynchronous APIs, to which I’ve been contributing/supporting an approach similar to async/await. Since I wrote a 5000 word essay this week on the topic, I’m going to call that my post.

Read: Asynchronous API for Python.

Debugging Collections

Diving into the debugger again, we’re going to look at a very early feature that I had nothing to do with. Common to all programming languages (and at a deeper level, all computation) are data structures that contain other elements – “collections”. These may be arrays, lists, sets, mappings, or any other number of simple or complex types, each with their own preferred access method and characteristics.

What do I mean by “preferred access method”? In short, the most efficient manner to retrieve the element that you want. For example, arrays are best accessed using the element index, while (linked) lists prefer accessing elements adjacent to another. Sets are best used for mapping a value to true or false (as in, “does the set contain this element?”), and so on.
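
In Python terms, a quick illustration:

values = [10, 20, 30]         # list: efficient access by index
print(values[1])              # 20

ages = {"alice": 30}          # mapping: efficient lookup by key
print(ages["alice"])          # 30

seen = {"alice", "bob"}       # set: efficient membership test
print("carol" in seen)        # False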

However, when it comes to debugging and introspection on a collection, the most useful access is complete enumeration. Retrieving every value in the collection and displaying it to the user allows many issues to be debugged that would otherwise be difficult or impossible. Those who are used to debugging in Visual Studio or another IDE will be familiar with the ability to view the contents of lists, sets and even list-like objects with no memory store (an IEnumerable, in .NET):

Viewing collection contents in the C# debugger

Like .NET, Python has a range of collection types, including some that can be infinitely large without actually being stored in memory. Being able to view the elements in these collections is essential when debugging, so much so that Python Tools for Visual Studio had support in its earliest public releases.

This post will look at the messages sent between the debugger and debuggee, how the collection contents are exposed to the user, and the approaches used to manage unusual collection types. The code is from changeset baff92317760, which is an early implementation and some changes have been made since.

Debugger Messages

As we saw in my earlier blog on User-unhandled Exceptions, there are a number of messages that may be sent between the debugger (which is a C# component running within Visual Studio) and the debuggee (a script running in a Python interpreter). The command sent is CHLD and it is handled in visualstudio_py_debugger.py like this:

self.command_table = {
    cmd('exit') : self.command_exit,
    cmd('stpi') : self.command_step_into,
    cmd('stpo') : self.command_step_out,
    cmd('stpv') : self.command_step_over,
    cmd('brkp') : self.command_set_breakpoint,
    cmd('brkc') : self.command_set_breakpoint_condition,
    cmd('brkr') : self.command_remove_breakpoint,
    cmd('brka') : self.command_break_all,
    cmd('resa') : self.command_resume_all,
    cmd('rest') : self.command_resume_thread,
    cmd('exec') : self.command_execute_code,
    cmd('chld') : self.command_enum_children,
    cmd('setl') : self.command_set_lineno,
    cmd('detc') : self.command_detach,
    cmd('clst') : self.command_clear_stepping,
    cmd('sexi') : self.command_set_exception_info,
    cmd('sehi') : self.command_set_exception_handler_info,
}
def command_enum_children(self):
    # execute given text in specified frame
    text = read_string(self.conn)
    tid = read_int(self.conn) # thread id
    fid = read_int(self.conn) # frame id
    eid = read_int(self.conn) # execution id
    child_is_enumerate = read_int(self.conn)

    thread = get_thread_from_id(tid)
    if thread is not None:
        cur_frame = thread.cur_frame
        for i in xrange(fid):
            cur_frame = cur_frame.f_back

        thread.enum_child_on_thread(text, cur_frame, eid, child_is_enumerate)

The two important parts to this handler are text and the enum_child_on_thread function. text is sent from the debugger and specifies the expression to obtain children for. This expression is compiled and evaluated in the context of the active call stack, which lets the user specify whatever they like in the Watch or Immediate windows, provided it is valid Python code.
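
Stripped of error handling and reporting, the evaluation amounts to something like this sketch (the function name is illustrative, not the actual debugger code):

def evaluate_in_frame(text, frame):
    # Compile the user's expression, then evaluate it with the globals and
    # locals of the selected call frame, so names resolve exactly as they
    # would in the code being debugged.
    code = compile(text, '<debug input>', 'eval')
    return eval(code, frame.f_globals, frame.f_locals)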

We will look at enum_child_on_thread later, but it will eventually call report_children in order to send the results back to the debugger:

def report_children(execution_id, children, is_index, is_enumerate):
    children = [(index, safe_repr(result), safe_hex_repr(result), type(result), type(result).__name__) for index, result in children]

    send_lock.acquire()
    conn.send(CHLD)
    conn.send(struct.pack('i', execution_id))
    conn.send(struct.pack('i', len(children)))
    conn.send(struct.pack('i', is_index))
    conn.send(struct.pack('i', is_enumerate))
    for child_name, obj_repr, hex_repr, res_type, type_name in children:
        write_string(child_name)
        write_object(res_type, obj_repr, hex_repr, type_name)

    send_lock.release()

When the enumerated collection is sent back, it comes as a CHLD command again and is handled in PythonProcess.cs:

switch (CommandtoString(cmd_buffer)) {
    case "EXCP": HandleException(socket); break;
    case "BRKH": HandleBreakPointHit(socket); break;
    case "NEWT": HandleThreadCreate(socket); break;
    case "EXTT": HandleThreadExit(socket); break;
    case "MODL": HandleModuleLoad(socket); break;
    case "STPD": HandleStepDone(socket); break;
    case "EXIT": HandleProcessExit(socket); return;
    case "BRKS": HandleBreakPointSet(socket); break;
    case "BRKF": HandleBreakPointFailed(socket); break;
    case "LOAD": HandleProcessLoad(socket); break;
    case "THRF": HandleThreadFrameList(socket); break;
    case "EXCR": HandleExecutionResult(socket); break;
    case "EXCE": HandleExecutionException(socket); break;
    case "ASBR": HandleAsyncBreak(socket); break;
    case "SETL": HandleSetLineResult(socket); break;
    case "CHLD": HandleEnumChildren(socket); break;
    case "OUTP": HandleDebuggerOutput(socket); break;
    case "REQH": HandleRequestHandlers(socket); break;
    case "DETC": _process_Exited(this, EventArgs.Empty); break;
}
private void HandleEnumChildren(Socket socket) {
    int execId = socket.ReadInt();
    ChildrenInfo completion;

    lock (_pendingChildEnums) {
        completion = _pendingChildEnums[execId];
        _pendingChildEnums.Remove(execId);
    }

    int childCount = socket.ReadInt();
    bool childIsIndex = socket.ReadInt() == 1;
    bool childIsEnumerate = socket.ReadInt() == 1;
    PythonEvaluationResult[] res = new PythonEvaluationResult[childCount];
    for (int i = 0; i < res.Length; i++) {
        string expr = socket.ReadString();
        res[i] = ReadPythonObject(socket, completion.Text, expr, childIsIndex, childIsEnumerate, completion.Frame);
    }
    completion.Completion(res);
}

_pendingChildEnums is part of the stateless communication infrastructure that allows the UI to remain responsive while the results are being collected. An array res is created to contain the results, which are then passed to the visualiser.

Displaying Children

Visual Studio provides a number of tool windows that display variable values. The one shown in the image at the top of this post is the Locals window, which displays variables in the active scope. There is also the Watch window, which lets the user enter any expression they like, the Parallel Watch window, which does the same thing across multiple threads, and the Autos window, which chooses relevant variables automatically.

For all of these views, the values that are displayed are implementations of the IDebugProperty2 interface: in PTVS, the implementation is in AD7Property.cs. (These are passed out to VS from various places in the debugger which we won’t be looking at right now.) The method that is most relevant here is IDebugProperty2.EnumChildren:

public int EnumChildren(enum_DEBUGPROP_INFO_FLAGS dwFields, uint dwRadix, 
                        ref System.Guid guidFilter, enum_DBG_ATTRIB_FLAGS dwAttribFilter,
                        string pszNameFilter, uint dwTimeout, out IEnumDebugPropertyInfo2 ppEnum) {
    ppEnum = null;
    var children = _evalResult.GetChildren((int)dwTimeout);
    if (children != null) {
        DEBUG_PROPERTY_INFO[] properties = new DEBUG_PROPERTY_INFO[children.Length];
        for (int i = 0; i < children.Length; i++) {
            properties[i] = new AD7Property(_frame, children[i], true)
                .ConstructDebugPropertyInfo(dwRadix, dwFields);
        }
        ppEnum = new AD7PropertyEnum(properties);
        return VSConstants.S_OK;
    }
    return VSConstants.S_FALSE;
}

This method is called when the variable is expanded in VS. The call to _evalResult.GetChildren performs the communication described earlier, and will block until the list of the collection contents is available or the timeout expires. New AD7Property instances are created for each returned expression, allowing them to also be displayed in the variable windows. If they are expandable, they can in turn be expanded and have their elements displayed.

The other method of interest is IDebugProperty2.GetPropertyInfo, which returns a filled DEBUG_PROPERTY_INFO structure consisting mostly of displayable strings. (In fact, AD7Property only implements one method other than these two, which is SetValueAsString. IDebugProperty2 really does only serve a single purpose.) These strings are what are displayed in Visual Studio:

Viewing collection contents in the Python debugger

Looking into Collections

Now that we’ve seen how the debugger and debuggee communicate with each other, and how the debugger communicates with Visual Studio, let’s have a look at obtaining the values of the collections.

The actual work is performed in visualstudio_py_debugger.py in the enum_child_locally function. Obtaining the members is surprisingly simple, since the debugger is written in Python and all supported collections have a consistent iteration interface:

if hasattr(res, 'items'):
    # dictionary-like object
    enum = res.items()
else:
    # indexable object
    enum = enumerate(res)

Dictionaries require special handling, since normal enumeration only includes keys and not values, but all other iterable objects can be passed directly to enumerate. Both cases of the snippet shown result in enum containing a sequence of key-value tuples, where the key for non-dictionary collections is the 0-based index. The section of code following this converts the sequence to a list.
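
A quick demonstration of the two cases:

d = {'a': 1, 'b': 2}
print(list(d.items()))       # [('a', 1), ('b', 2)] - key/value pairs

s = ['x', 'y']
print(list(enumerate(s)))    # [(0, 'x'), (1, 'y')] - index/value pairs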

A large amount of exception handling exists to safely handle ‘bad’ objects. Any developer can implement their own __iter__(), __next__() (née next()) or items() methods, potentially causing issues in our code. When an exception is thrown, we either skip that element (which also skips that index) or simply abandon enumeration completely. However, there are two types of ‘collections’ that need special handling.

The first are infinite iterators, which are a “list-like object with no memory store” like we saw earlier. When one of these is encountered, there is no way to discover it in advance. A timeout is used to prevent Visual Studio from waiting forever. However, because of the stateless nature of the debugger/debuggee communication, even though VS has decided it is no longer interested, the Python code will continue creating an infinitely long list of objects to return (until a MemoryError is raised, and then the list and the strings are deallocated and it’s as if nothing ever happened).

A relatively simple fix is used here: the number of elements returned is capped at 10,000. Because it is completely unwieldy to view 10,000 elements in the variable windows, most users will never encounter this limit. At the same time, users are far less likely to see an iterable displaying no elements at all, and infinite iteration errors are often identifiable from a short subsequence that would otherwise not be displayed. So while it looks like a hack, the end result is a better experience. (And you can view any element you like by adding ‘[<index>]’ after the name in the Watch window, even well beyond the 10,000 element limit.)
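
The idea of the cap can be expressed very compactly; a sketch only (the real code applies the limit inside a larger loop, with the exception handling described above):

import itertools

def capped_enumeration(obj, limit=10000):
    # Stop after `limit` elements so an infinite iterator cannot consume
    # unbounded time or memory; ordinary collections never hit the cap.
    return list(itertools.islice(enumerate(obj), limit))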

The other ‘collection’ that receives special handling is objects themselves. You have probably noticed throughout the code samples and SDK documentation that the “expandable” objects contain “children”. This functionality is not just used for collections, but also to allow objects to expand and list their members. If all the code above fails with an exception (which will typically be TypeError when attempting to iterate a non-iterable object), the following code runs:

# non-indexable object, return attribute names, filter callables
items = []
for name in dir(res):
    if not (name.startswith('__') and name.endswith('__')):
        try:
            item = getattr(res, name)
            if not hasattr(item, '__call__'):
                items.append( (name, item) )
        except:
            # skip this item if we can't display it...
            pass

This uses Python’s dir function to obtain a list of all the members of the object, filters out private (__special_name__) members and callable methods, and returns the rest in a similar fashion to the members of a dictionary.

Summary

Debugger support for enumerating collections allows Python Tools for Visual Studio to display members of collections in the user’s code. Visual Studio provides the user interface for the functionality, requiring only minimal implementation on the part of the language-specific debugger. As well as collections, PTVS (and other VS languages) use this to expand regular objects and display their member variables. This feature was part of the very first releases of PTVS.

Smart Indentation for Python

The text editor in Visual Studio provides a number of options related to indentation. Apart from the standard tabs/spaces and “how many” options, you can choose between three behaviours: “None”, “Block” and “Smart.”

The 'Tabs' options dialog in Visual Studio 2012

To my knowledge, the “None” mode is very rarely used. When enabled, pressing Enter will move the caret (the proper name for the text cursor) to the first column of the following line. While this is the normal behaviour in a word processor, it does not suit programming quite as well.

“Block” mode is more useful. In this mode, pressing Enter will move the caret to the following line and insert as many spaces/tabs as appeared at the start of the previous line. This ensures that consecutive lines of text all start in the same column by default, which is commonly used as a hint to the programmer that they are part of the same block.

However, the most common (and default) mode is “Smart.” Unlike the other two modes, which are built into the editor, smart indentation is provided by a language service (language services are also responsible for syntax highlighting and completions). Because they are targeted to a specific language, they can help the programmer by automatically adding and removing extra indentation in ways that make sense.

For example, the smart indentation for C++ will automatically add an indent after an open brace, which begins a new block, and remove an indent for a close brace, which ends the block. Similarly, an indent is added after “if” or “while” statements, as well as the others that support implicit blocks, and removed after the one statement that may follow. In most cases, a programmer can simply continue typing without ever having to manually indent or unindent their code.

This feature has existed since very early in Python Tools for Visual Studio's life, but the implementation has changed significantly over time. In this post we will look at the algorithms used in changesets 41aa3fe86341 and 4db951c455d5, as well as the general approach to providing smart indentation from a language service.

ISmartIndentProvider

Since Visual Studio 2010, language services have provided smart indenting by exporting an MEF component implementing ISmartIndentProvider. This interface includes one method, CreateSmartIndent, which returns an object implementing ISmartIndent for a provided text view. ISmartIndent also only includes one method (ignoring Dispose), GetDesiredIndentation, which returns the number of spaces to indent by. VS will convert this to tabs (or a mix of tabs and spaces) depending on the user’s settings.

The entire implementation of these interfaces in PTVS looks like this:

[Export(typeof(ISmartIndentProvider))]
[ContentType(PythonCoreConstants.ContentType)]
public sealed class SmartIndentProvider : ISmartIndentProvider {
    private sealed class Indent : ISmartIndent {
        private readonly ITextView _textView;
 
        public Indent(ITextView view) {
            _textView = view;
        }
 
        public int? GetDesiredIndentation(ITextSnapshotLine line) {
            if (PythonToolsPackage.Instance.LangPrefs.IndentMode == vsIndentStyle.vsIndentStyleSmart) {
                return AutoIndent.GetLineIndentation(line, _textView);
            } else {
                return null;
            }
        }
 
        public void Dispose() { }
    }
 
    public ISmartIndent CreateSmartIndent(ITextView textView) {
        return new Indent(textView);
    }
}

The AutoIndent class referenced in GetDesiredIndentation contains the algorithm for calculating how many spaces are required. Two algorithms for this are described in the following sections, the first using reverse detection and the second using forward detection.

Reverse Indent Detection

This algorithm was used in PTVS up to changeset 41aa3fe86341, which was shortly before version 1.0 was released. It was designed to be efficient by minimising the amount of code scanned, but ultimately got so many corner cases wrong that it was easier to replace it with the newer algorithm. The full source file is AutoIndent.cs.

At its simplest, indent detection in Python is based entirely on the preceding line. The normal case is to copy the indentation from that line. However, if it ends with a colon then we should add one level, since that is how Python starts blocks. Also, we can safely remove one level if the previous line is a return, raise, break or continue statement, since no more of that block will be executed. (As a bonus, we also do this after a pass statement, since its main use is to indicate an empty block.) The complications come when the preceding textual line is not the preceding line of code.
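
As a sketch (plain Python, and deliberately ignoring the complications that follow), the simple rule looks like this:

def naive_indent(prev_line, indent_size=4):
    # Copy the previous line's indentation, add a level after a colon, and
    # remove a level after a statement that ends the block early.
    indent = len(prev_line) - len(prev_line.lstrip())
    words = prev_line.split()
    if prev_line.rstrip().endswith(':'):
        indent += indent_size
    elif words and words[0] in ('return', 'raise', 'break', 'continue', 'pass'):
        indent = max(0, indent - indent_size)
    return indent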

Take the following example:

1  if a == 1 and (b == 2 or
2                 c == 3):

How many spaces should we add for line 3? According to the above algorithm, we’d add 15 plus the user’s indent size setting (for the colon on line 2). This is clearly not correct, since the if statement has 0 leading spaces, but it is the result when applying the simpler algorithm.

Finding the start of an expression is actually such a common task in a language service that PTVS has a ReverseExpressionParser class. It tokenises the source code as normal, but rather than parsing from the start it parses backwards from an arbitrary point. Since the parser state (things like the number of open brackets) is unknown, the class is most useful for identifying the span of a single expression (hence the name).

For smart indentation, the parser is used twice: once to find the start of the expression assuming we’re not inside a grouping (the zero on line 102) and once assuming we are inside a grouping (the one on line 103). The span provided on line 94 is the location of the last token before the caret, which for smart indenting should be an end-of-line character (which the parser automatically skips).

 90  // use the expression parser to figure out if we're in a grouping...
 91  var revParser = new ReverseExpressionParser(
 92          line.Snapshot,
 93          line.Snapshot.TextBuffer,
 94          line.Snapshot.CreateTrackingSpan(
 95              tokens[tokenIndex].Span.Span,
 96              SpanTrackingMode.EdgePositive
 97          )
 98      );
 99
100  int paramIndex;
101  SnapshotPoint? sigStart;
102  var exprRangeNoImplicitOpen = revParser.GetExpressionRange(0, out paramIndex, out sigStart, false);
103  var exprRangeImplicitOpen = revParser.GetExpressionRange(1, out paramIndex, out sigStart, false);

The values of exprRangeNoImplicitOpen and exprRangeImplicitOpen are best described by example. Consider the following code:

1  def f(a, b, c):
2      if a == b and b != c:
3          return (a +
4                  b * c
5                  + 123)
6      while a < b:
7          a = a + c

When parsing starts at the end of line 4, exprRangeNoImplicitOpen will reference the span containing b * c, since that is a complete expression (remembering that the parser does not know it is inside parentheses). exprRangeImplicitOpen is initialised with one open grouping, so it will reference (a + b * c. However, if we start parsing at the end of line 7, exprRangeNoImplicitOpen will reference a + c while exprRangeImplicitOpen will be null, since an assignment within a group would be an error.

Using the two expressions, we can create a new set of indentation rules:

  • If exprRangeImplicitOpen was found, exprRangeNoImplicitOpen was not (or is different to exprRangeImplicitOpen), and the expression starts with an open grouping ((, [ or {), we are inside a set of brackets.
    • In this case, we match the indentation of the brackets + 1, as on line 2 of the earlier example.
  • Otherwise, if exprRangeNoImplicitOpen was found and it is preceded by a return or raise, break or continue statement, OR if the last token is one of those keywords, the previous line must be one of those statements.
    • In this case, we copy the indentation and reduce it by one level for the following line.
  • Otherwise, if both ranges were found, we have a valid expression on one line and one that spans multiple lines.
    • This occurred in the example shown above.
    • In this case, we find the lowest indentation on any line of the multi-line expression and use that.
  • Otherwise, if the last non-newline character is a colon, we are at the first line of a new block.
    • In this case, we copy the indentation and increase it by one level.

These rules are implemented on lines 105 through 143 of AutoIndent.cs. However, with this approach there are many cases that need special handling, and most of the above ‘rules’ are the result of such cases being discovered. For example, issue 157 goes through a lot of these edge cases, and while most of them were resolved, it remained a less-than-robust algorithm. The alternative approach, described below, was added to handle most of these issues directly rather than as workarounds.

Forward Indent Detection

This algorithm replaced the reverse algorithm for PTVS 1.0 and has been used since with very minor modifications. It potentially sacrifices some performance in order to obtain more consistent results, as well as being able to support a slightly wider range of interesting rules. The discussion here is based on the implementation as of changeset 4db951c455d5; full source at AutoIndent.cs.

For this algorithm, the reverse expression parser remains but is used slightly differently. Its definition was changed slightly to allow external code to enumerate tokens from it (by adding an IEnumerable<ClassificationSpan> implementation) and its IsStmtKeyword() method was made public. This allows AutoIndent.GetLineIndentation() to perform its own parsing:

var tokenStack = new System.Collections.Generic.Stack<ClassificationSpan>();
tokenStack.Push(null);  // end with an implicit newline
bool endAtNextNull = false;
 
foreach (var token in revParser) {
    tokenStack.Push(token);
    if (token == null && endAtNextNull) {
        break;
    } else if (token != null &&
       token.ClassificationType == revParser.Classifier.Provider.Keyword &&
       ReverseExpressionParser.IsStmtKeyword(token.Span.GetText())) {
        endAtNextNull = true;
    }
}

The result of this code is a list of tokens leading up to the current location and guaranteed to start from outside any grouping. Depending on the structure of the preceding code, this may result in quite a large list of tokens; only a statement that can never appear in an expression will stop the reverse parse. This specifically excludes if, else, for and yield, which can all appear within expressions, and so all tokens up to the start of the method or class may be included. This is unfortunate, but also required to make guarantees about the parser state without parsing the entire file from the beginning (which is the only other known state).

Since the parser state is known at the first token, we can parse forward and track the indentation level. The algorithm now looks like this as we visit each token in order (by popping them off of tokenStack):

  • At a new line, set the indentation level to however many spaces appear at the start of the line.
  • At an open bracket, set the indentation level to the column of the bracket plus one and remember the previous level.
    • This ensures that if we reach the end of the list while inside the group, our current level lines up with the group and not the start of the line.
  • At a close bracket, restore the previous indentation level.
    • This ensures that whatever indentation occurs within a group, we will use the original indentation for the line following.
  • At a line continuation character (a backslash at the end of a line), skip ahead until the end of the entire line of code.
  • If the token is a statement to unindent after (return and friends), set a flag to unindent.
    • This flag is preserved, restored and reset with the indentation level.
  • If the token is a colon character and we are not currently inside a group, set a flag to add an indent.
    • And, if the following token is not an end-of-line token, also set the unindent flag.

After all tokens have been scanned, we will have the required indentation level and two flags indicating whether to add or remove an indent level. These flags are separate because they may both be set (for example, after a single-line if statement such as if a == b: return False). If they don’t cancel each other out, then an indent should be added or removed to the calculated level to find where the next line should appear:

indentation = current.Indentation +
    (current.ShouldIndentAfter ? tabSize : 0) -
    (current.ShouldDedentAfter ? tabSize : 0);

The implementation of this algorithm uses a LineInfo structure and a stack to manage preserving and restoring state:

private struct LineInfo {
    public static readonly LineInfo Empty = new LineInfo();
    public bool NeedsUpdate;
    public int Indentation;
    public bool ShouldIndentAfter;
    public bool ShouldDedentAfter;
}

And the structure of the parsing loop looks like this (edited for length):

while (tokenStack.Count > 0) {
    var token = tokenStack.Pop();
    if (token == null) {
        current.NeedsUpdate = true;
    } else if (token.IsOpenGrouping()) {
        indentStack.Push(current);
        ...
    } else if (token.IsCloseGrouping()) {
        current = indentStack.Pop();
        ...
    } else if (ReverseExpressionParser.IsExplicitLineJoin(token)) {
        ...
    } else if (current.NeedsUpdate == true) {
        current.Indentation = GetIndentation(line.GetText(), tabSize);
        ...
    }
 
    if (ShouldDedentAfterKeyword(token)) {
        current.ShouldDedentAfter = true;
    }
 
    if (token != null && token.Span.GetText() == ":" && indentStack.Count == 0) {
        current.ShouldIndentAfter = true;
        current.ShouldDedentAfter = (tokenStack.Count != 0 && tokenStack.Peek() != null);
    }
}

A significant advantage of this algorithm over the reverse indent detection is its obviousness. It is much easier to follow the code for the parsing loop than to attempt to interpret the behaviour and interactions inherent in the reverse algorithm. Further, modifications can be more easily added because of the clear structure. For example, the current behaviour indents the contents of a grouping to the level of the opening token, but some developers prefer to only add one indent level and no more. With the reverse algorithm, finding the section of code requiring a change is difficult, but the forward algorithm has an obvious code path at the start of groups.

Summary

Smart indentation allows Python Tools for Visual Studio to assist the developer by automatically indenting code to the level it is usually required at. Since Python uses white-space to define scopes, much like C-style languages use braces, this makes writing Python code simpler and can reduce “inconsistent-whitespace” errors. Language services provide this feature by exporting (through MEF) an implementation of ISmartIndentProvider. This post looked at two algorithms for determining the required indentation based on the preceding code, the latter of which shipped with PTVS 1.0.

User-unhandled Exceptions

This feature is one that I added to Python Tools for Visual Studio as an intern in 2011. Those familiar with debugging in Visual Studio will know that if your code throws an exception that is not handled, the debugger breaks at the statement that caused the error:

This feature is known as a user-uncaught exception and is incredibly useful in debugging, since it provides an opportunity to inspect local state before the stack unwinds. It is optional but defaults to on for most exceptions:

The “thrown” column breaks before the system traverses the stack looking for a handler (the “first chance”), while the “user-unhandled” column breaks after traversal if no code chooses to handle it, but before the stack actually begins to unwind (the “second chance”).

In Python, an unhandled exception will unwind the stack immediately, executing finally blocks and otherwise destroying state as it goes, printing a basic trace at the end. In other words, the first traversal step does not exist in Python and there is no second chance. Before adding this feature, PTVS only supported breaking on the first chance:

The aim of this feature was to mimic other languages by simulating the first traversal of exception handlers without modifying the program state. Step one of this task was to enable the UI, step two was to find a way to detect active exception handlers, and step three was to extend the debugger to support the feature.

To follow along at home, the changeset is baff92317760, which I will quote from (and link to) where relevant. (The code has changed again since this commit, but these are the relevant ones for this discussion.) I also recorded a short video around the time this feature was added to demonstrate how it works.

Enabling the UI

Enabling the check boxes for user-unhandled exceptions is simply a case of changing the registration. Each debugging engine can list the exceptions it supports under AD7Metrics\Exception\{engine_guid}\{exception_name}. This allows the user to select whether or not to break on these exceptions, while the debugging engine is still fully responsible for handling the break itself.

Within these keys is a State value that combines values from the EXCEPTION_STATE enumeration. The values we used are EXCEPTION_STOP_USER_UNCAUGHT and EXCEPTION_JUST_MY_CODE_SUPPORTED (0x4020):

The state value for Python exception ArithmeticError is set to 0x4020

In PTVS, the exceptions are registered using the ProvideDebugException attribute on PythonToolsPackage.cs, which includes a state parameter for setting this value. By modifying ProvideDebugExceptionAttribute.cs we can simply change the default for all of our exceptions:

    _state = (int)(enum_EXCEPTION_STATE.EXCEPTION_JUST_MY_CODE_SUPPORTED | 
                   enum_EXCEPTION_STATE.EXCEPTION_STOP_USER_UNCAUGHT);

After adding this line to the attribute constructor and rebuilding, the dialog has the checkboxes enabled:

However, simply making the option available does not provide any functionality, since it is entirely managed by Visual Studio. The options selected by the user are exposed to the debugger through the IDebugEngine2.SetException method, implemented in AD7Engine.cs. The following changes were made to support both forms of exception handling:

         int IDebugEngine2.SetException(EXCEPTION_INFO[] pException) {
+            bool sendUpdate = false;
             for (int i = 0; i < pException.Length; i++) {
                 if (pException[i].guidType == DebugEngineGuid) {
+                    sendUpdate = true;
                     if (pException[i].bstrExceptionName == "Python Exceptions") {
-                        _defaultBreakOnException = true;
+                        _defaultBreakOnExceptionMode =
+                            (int)(pException[i].dwState & (enum_EXCEPTION_STATE.EXCEPTION_STOP_FIRST_CHANCE | enum_EXCEPTION_STATE.EXCEPTION_STOP_USER_UNCAUGHT));
                     } else {
-                        _breakOnException.Add(pException[i].bstrExceptionName);
+                        _breakOnException[pException[i].bstrExceptionName] = 
+                            (int)(pException[i].dwState & (enum_EXCEPTION_STATE.EXCEPTION_STOP_FIRST_CHANCE | enum_EXCEPTION_STATE.EXCEPTION_STOP_USER_UNCAUGHT));
                     }
                 }
             }
 
-            _process.SetExceptionInfo(_defaultBreakOnException, _breakOnException);
+            if (sendUpdate) {
+                _process.SetExceptionInfo(_defaultBreakOnExceptionMode, _breakOnException);
+            }
             return VSConstants.S_OK;
         }

One trick here is that this method is only called when a setting is different from its default, which means that the debug engine has to be aware of all the default values. For example, we made AttributeError not break by default, since it is often handled in a way we cannot detect, but now SetExceptionInfo only receives a notification when handling is enabled, and so the debug engine has to be aware that, unless told otherwise, it should not break on AttributeError. The changes required to make it break when the exception is thrown are discussed later.

Detecting Exception Handlers

As described above (briefly), Python uses a different approach to thrown exceptions than other languages. The main difference is that exception handlers are checked as the stack unwinds, so by the time Python code can be sure that an exception is not handled, the stack is gone, all finally blocks have been run and only the traceback remains. The traceback contains source files and line numbers, but no information about local variables or values. Ideally, the debugger would detect that an exception is unhandled without unwinding and break in before the call stack is destroyed.
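
A small standard-Python example of what the debugger is up against (run it without a debugger): the finally block executes during the unwind, before the traceback is printed, so by the time anything is reported the local state is already gone.

def inner():
    try:
        raise ValueError("oops")
    finally:
        # Runs while the stack unwinds; local state in inner() is
        # discarded before the traceback below is printed.
        print("finally ran")

inner()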

Since we are restricted to static analysis, a fallback position is necessary. The implementation assumes that all exceptions are unhandled unless proven otherwise, which makes us more likely to break when an exception occurs. The alternative would result in more stack traces and less breaking. Since the developer is already debugging their code, it is fair to assume that a false positive (breaking when a handler exists) is preferable to a false negative (not breaking when there is no handler).

In terms of implementation, the GetHandledExceptionRanges method in PythonProcess.cs does the main detection work:

internal IList<Tuple<int, int, IList<string>>> GetHandledExceptionRanges(string filename) {
    PythonAst ast;
    TryHandlerWalker walker = new TryHandlerWalker();
    var statements = new List<Tuple<int, int, IList<string>>>();

    try {
        using (var source = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read)) {
            ast = Parser.CreateParser(source, LanguageVersion).ParseFile();
            ast.Walk(walker);
        }
    } catch (Exception ex) {
        Debug.WriteLine("Exception in GetHandledExceptionRanges:");
        Debug.WriteLine(string.Format("Filename: {0}", filename));
        Debug.WriteLine(ex);
        return statements;
    }

    foreach (var statement in walker.Statements) {
        int start = statement.GetStart(ast).Line;
        int end = statement.Body.GetEnd(ast).Line + 1;
        var expressions = new List<string>();

        if (statement.Handlers == null) {
            expressions.Add("*");
        } else {
            foreach (var handler in statement.Handlers) {
                Expression expr = handler.Test;
                TupleExpression tuple;
                if (expr == null) {
                    expressions.Clear();
                    expressions.Add("*");
                    break;
                } else if ((tuple = handler.Test as TupleExpression) != null) {
                    foreach (var e in tuple.Items) {
                        var text = ToDottedNameString(e, ast);
                        if (text != null) {
                            expressions.Add(text);
                        }
                    }
                } else {
                    var text = ToDottedNameString(expr, ast);
                    if (text != null) {
                        expressions.Add(text);
                    }
                }
            }
        }

        if (expressions.Count > 0) {
            statements.Add(new Tuple<int, int, IList<string>>(start, end, expressions));
        }
    }

    return statements;
}

Given a filename, we parse the file and produce a syntax tree. TryHandlerWalker simply collects all the try nodes, each of which includes the list of caught expressions and, importantly, the line numbers of the try statement and the first except statement. This information is sent back to the process being debugged in response to a REQH command (discussed below).
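
For readers who want to experiment, the same collection can be approximated with Python's standard ast module rather than the PTVS parser. This is a rough sketch (assuming Python 3.9+ for end_lineno and ast.unparse), not the actual walker, and unlike ToDottedNameString it does not filter out expressions that are more than plain dotted names:

# Rough equivalent of TryHandlerWalker using the standard ast module
# (Python 3.9+); the real code uses the PTVS parser and keeps only
# plain dotted names, discarding anything more complicated.
import ast

def handled_exception_ranges(filename):
    with open(filename) as f:
        tree = ast.parse(f.read(), filename)
    ranges = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Try):
            continue
        start = node.lineno                    # line of the 'try'
        end = node.body[-1].end_lineno + 1     # one past the end of the try body
        names = []
        for handler in node.handlers:
            if handler.type is None:           # bare 'except:' catches everything
                names = ['*']
                break
            elif isinstance(handler.type, ast.Tuple):
                names.extend(ast.unparse(e) for e in handler.type.elts)
            else:
                names.append(ast.unparse(handler.type))
        if names:
            ranges.append((start, end, names))
    return ranges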

Later, the debugger will parse the expressions from the except statements to compare against the thrown exception. Line numbers are sufficient to determine whether code is inside a try block (since try and except can only appear at the start of a line), but anything at all can be specified as the except expression. For example, the following is working (though hardly advisable, and certainly not easy to read) code:

try:
    do_something()
except get_exception_types()[1]:
    print("handled")

When we are simulating a stack unwind, we cannot call the function to find out what it returns, firstly because we are not in the correct call frame and secondly because of potential side effects. (The closest thing to a sensible use of a call as an except expression is logging, which would mean that if we called it we would produce spurious messages.)

To handle as many safe cases as possible, the debugger will only look up names and modules. As soon as anything more complicated appears in an expression list, we assume that it does not handle the current exception and keep looking for another handler. Because we have access to the call frames, we are able to look up names in the correct scope. For example, in the following code we will determine that the exception has a handler, even though its name (zde) is defined in a different call frame than where the exception is thrown:

def test_1(x):
    return 100 / x

def test_2(x):
    zde = ZeroDivisionError
    try:
        return test_1(x)
    except zde:
        return 0

An issubclass call is used to determine whether the exception will be caught by each possible handler.¹ When it returns true (or a plain except: is encountered), execution is continued and the debugger is not notified of the exception. If we reach a call frame that does not have code available (meaning we can’t get the handlers) or that belongs to the debugger itself, we assume the exception is unhandled and break.

¹ Of course, this means that arbitrary code could still be executed in the form of a __subclasscheck__ method. This method should not have side effects anyway, so all we’d really achieve by calling it is to expose a bug earlier.
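
The scope-aware lookup is performed by lookup_local in visualstudio_py_debugger.py (it appears in the IsHandled listing below). The helper here is only a sketch of the idea, not the PTVS implementation: resolve the first part of a dotted name against the frame's locals, then globals, then builtins, and follow attributes from there.

# Sketch only, not the PTVS lookup_local: resolve a dotted name such
# as 'zde' or 'errors.MyError' in the scope of the given frame.
# Returns None when the name cannot be found.
def lookup_name(frame, dotted_name):
    parts = dotted_name.split('.')
    obj = frame.f_locals.get(parts[0],
          frame.f_globals.get(parts[0],
          frame.f_builtins.get(parts[0])))
    for attr in parts[1:]:
        if obj is None:
            return None
        obj = getattr(obj, attr, None)
    return obj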

Extending the Debugger

In order for the debugger to support the updated exception handling mechanism, we have to add the new REQH (REQuest Handlers) and SEHI (Set Exception Handler Info) commands. Commands are used for stateless communication between the Python process being debugged and the Visual Studio instance doing the debugging. REQH is sent from the debuggee to request the list of exception handlers in a particular source file, and SEHI is sent with the response. PythonProcess.cs contains the handling for REQH events:

switch (CommandtoString(cmd_buffer)) {
    case "EXCP": HandleException(socket); break;
    case "BRKH": HandleBreakPointHit(socket); break;
    case "NEWT": HandleThreadCreate(socket); break;
    case "EXTT": HandleThreadExit(socket); break;
    case "MODL": HandleModuleLoad(socket); break;
    case "STPD": HandleStepDone(socket); break;
    case "EXIT": HandleProcessExit(socket); return;
    case "BRKS": HandleBreakPointSet(socket); break;
    case "BRKF": HandleBreakPointFailed(socket); break;
    case "LOAD": HandleProcessLoad(socket); break;
    case "THRF": HandleThreadFrameList(socket); break;
    case "EXCR": HandleExecutionResult(socket); break;
    case "EXCE": HandleExecutionException(socket); break;
    case "ASBR": HandleAsyncBreak(socket); break;
    case "SETL": HandleSetLineResult(socket); break;
    case "CHLD": HandleEnumChildren(socket); break;
    case "OUTP": HandleDebuggerOutput(socket); break;
    case "REQH": HandleRequestHandlers(socket); break;    case "DETC": _process_Exited(this, EventArgs.Empty); break;
}
private void HandleRequestHandlers(Socket socket) {
    string filename = socket.ReadString();
    Debug.WriteLine("Exception handlers requested for: " + filename);
    var statements = GetHandledExceptionRanges(filename);
    _socket.Send(SetExceptionHandlerInfoCommandBytes);
    SendString(_socket, filename);

    _socket.Send(BitConverter.GetBytes(statements.Count));

    foreach (var t in statements) {
        _socket.Send(BitConverter.GetBytes(t.Item1));
        _socket.Send(BitConverter.GetBytes(t.Item2));

        foreach (var expr in t.Item3) {
            SendString(_socket, expr);
        }
        SendString(_socket, "-");
    }
}
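
Reading the Send calls off, the SEHI payload has the following shape (a reconstruction from the code above, not a separate specification):

# SEHI payload as produced by HandleRequestHandlers:
#   string  filename
#   int32   statement_count
#   then, statement_count times:
#     int32   line_start
#     int32   line_end
#     string  expression      (zero or more, one per caught type)
#     string  "-"             (terminates the expression list)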

The GetHandledExceptionRanges method was shown above; HandleRequestHandlers calls this method and transmits the line numbers and exception types to the debuggee. The implementation on the debuggee side is in visualstudio_py_debugger.py. The handling for the SEHI command looks like this:

self.command_table = {
    cmd('exit') : self.command_exit,
    cmd('stpi') : self.command_step_into,
    cmd('stpo') : self.command_step_out,
    cmd('stpv') : self.command_step_over,
    cmd('brkp') : self.command_set_breakpoint,
    cmd('brkc') : self.command_set_breakpoint_condition,
    cmd('brkr') : self.command_remove_breakpoint,
    cmd('brka') : self.command_break_all,
    cmd('resa') : self.command_resume_all,
    cmd('rest') : self.command_resume_thread,
    cmd('exec') : self.command_execute_code,
    cmd('chld') : self.command_enum_children,
    cmd('setl') : self.command_set_lineno,
    cmd('detc') : self.command_detach,
    cmd('clst') : self.command_clear_stepping,
    cmd('sexi') : self.command_set_exception_info,
    cmd('sehi') : self.command_set_exception_handler_info,
}
def command_set_exception_handler_info(self):
    try:
        filename = read_string(self.conn)
        statement_count = read_int(self.conn)
        handlers = []
        for _ in xrange(statement_count):
            line_start, line_end = read_int(self.conn), read_int(self.conn)
            expressions = set()
            text = read_string(self.conn).strip()
            while text != '-':
                expressions.add(text)
                text = read_string(self.conn)

            if not expressions:
                expressions = set('*')

            handlers.append((line_start, line_end, expressions))
        BREAK_ON.handler_cache[filename] = handlers
    finally:
        BREAK_ON.handler_lock.release()

Notice that the filename is sent both ways. Because the protocol is stateless, the debuggee cannot automatically associate the response with its request. In practice, it will block until it receives the response it expects, but it would require no protocol modifications to support preemptively sending SEHI commands. To avoid parsing source files repeatedly, the debuggee caches the handler lists, which are stored as tuples containing the same information found by GetHandledExceptionRanges.
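
For illustration, after one SEHI response the cache might look like this (the filename and line numbers are invented; the tuples are exactly those built by command_set_exception_handler_info):

# Invented example of the debuggee-side cache contents.
BREAK_ON.handler_cache = {
    'c:\\project\\app.py': [
        (10, 14, {'ValueError', 'KeyError'}),  # try on line 10, body ends on 13
        (20, 23, {'*'}),                       # a bare 'except:' block
    ],
}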

Breaking on an exception is relatively straightforward. The handler for when an exception is thrown already had a ShouldBreak test, which was changed to check the mode for the exception and, if necessary, whether any handlers exist.

def handle_exception(self, frame, arg):
    if frame.f_code.co_filename != __file__ and BREAK_ON.ShouldBreak(self, *arg):
        self.block(lambda: report_exception(frame, arg, self.id))

-    def ShouldBreak(self, name):
-        return self.break_always or name in self.break_on

+    def ShouldBreak(self, thread, ex_type, ex_value, trace):
+        name = ex_type.__module__ + '.' + ex_type.__name__
+        mode = self.break_on.get(name, self.default_mode)
+        return (bool(mode & BREAK_MODE_ALWAYS) or
+                (bool(mode & BREAK_MODE_UNHANDLED) and not self.IsHandled(thread, ex_type, ex_value, trace)))

The IsHandled method performs the call-stack traversal described earlier, returning False at the first frame that has no code available or True at the first frame that has a handler matching the exception type:

    # Edited for length
    def IsHandled(self, thread, ex_type, ex_value, trace):
        ... 
        cur_frame = trace.tb_frame
 
        while should_send_frame(cur_frame) and cur_frame.f_code.co_filename is not None:
            handlers = self.handler_cache.get(cur_frame.f_code.co_filename)
 
            if handlers is None:
                # get handlers, or assume unhandled and return False
                ...
 
            line = cur_frame.f_lineno
            for line_start, line_end, expressions in handlers:
                if line_start <= line < line_end:
                    if '*' in expressions:
                        return True
 
                    for text in expressions:
                        res = lookup_local(cur_frame, text)
                        if res is not None and issubclass(ex_type, res):
                            return True
 
            cur_frame = cur_frame.f_back
 
        return False

The search starts from the first frame in the traceback, which is the frame where the exception was caused, and goes backwards through f_back (that is, towards the caller) searching each file’s handler list. If no handlers are cached, they are requested from the debugger; the linear search and the use of line ranges ensure correct behaviour for nested try blocks.
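
For example, in the (illustrative) nested blocks below, an exception raised by the open call has a line number that falls inside both line ranges, so the search can match it against the inner handler list as well as the outer one:

def read_config(path):
    try:                               # outer range spans the whole inner block
        try:                           # inner range covers the open call
            return open(path).read()   # an error raised here is inside both ranges
        except FileNotFoundError:
            return ''
    except OSError:
        return None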

Summary

This feature added the ability for Python Tools for Visual Studio to detect and break on unhandled exceptions without unwinding the stack. This allows developers to inspect the state of their program and not just the stack trace. Handler detection is based on simple source code analysis, performed when an exception has been thrown, and assumes that if it can’t guarantee that the exception has a handler then it should break. It was first released in PTVS 1.0, August 2011.