Implementing a Terraform Provider

Hetzner Cloud is a cheap cloud provider with datacenters in Germany and a nifty API. I used it to host a video conferencing platform for my kids’ school.

While there is an excellent Terraform provider for Hetzner Cloud, I couldn’t find one for their DNS service and decided to implement a provider on my own. You can find the Terraform provider hetznerdns on GitHub.

In this post I share what would have saved me some hours of searching the web, and what I wish I had known before I started.

Removing Resources from Local State

There is a section in the writing custom providers guide about implementing a read callback. This function reads the actual state of a resource, let’s say a DNS record, and syncs it with the local state.

Most providers I looked into use the ID from the local state to fetch the actual state from an API. In the guide, if the API client returns an error, the resource is removed from local state. This is the correct behavior if the resource really was deleted, for example manually. The assumption here is that any error means the resource doesn’t exist anymore.

That’s not true in my implementation. If the API returns an HTTP 500, or isn’t available at all when Terraform refreshes the state, the client returns an error, but the resource still exists. In this case you don’t want to remove it from local state.

func resourceRecordRead(d *schema.ResourceData, m interface{}) error {
	client := m.(*api.Client)
	id := d.Id()

	// Any error (HTTP 500, API unreachable, ...) aborts the read
	// and leaves the resource in local state.
	record, err := client.GetRecord(id)
	if err != nil {
		return fmt.Errorf("error getting record with id %s: %s", id, err)
	}

	// A nil record without an error means the API answered with HTTP 404.
	if record == nil {
		log.Printf("[WARN] DNS record with id %s doesn't exist, removing it from state", id)
		d.SetId("")
		return nil
	}

	// update local state omitted for clarity
	return nil
}

So I decided that the client returns nil instead of an error when the API returns an HTTP 404. In this case d.SetId("") removes the resource from local state. When the API returns any other unhandled response, like an HTTP 500, the client returns an error, and the read function returns that error without removing the resource from local state.
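To make the client side of this concrete, here is a minimal sketch of a GetRecord that distinguishes "not found" from "something went wrong". It is not the actual hetznerdns client: the Record and Client types, the /records/ path, the JSON shape, and the newFakeAPI test server are all assumptions made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Record is a hypothetical, stripped-down DNS record.
type Record struct {
	ID    string `json:"id"`
	Name  string `json:"name"`
	Value string `json:"value"`
}

// Client is a hypothetical API client; BaseURL and the URL layout are assumptions.
type Client struct {
	BaseURL string
	HTTP    *http.Client
}

// GetRecord returns the record on HTTP 200, (nil, nil) on HTTP 404,
// and an error for transport failures and every other status code.
func (c *Client) GetRecord(id string) (*Record, error) {
	resp, err := c.HTTP.Get(c.BaseURL + "/records/" + id)
	if err != nil {
		return nil, fmt.Errorf("error getting record %s: %w", id, err)
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusOK:
		var r Record
		if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
			return nil, fmt.Errorf("error decoding record %s: %w", id, err)
		}
		return &r, nil
	case http.StatusNotFound:
		// The only case in which the caller may drop the resource from state.
		return nil, nil
	default:
		return nil, fmt.Errorf("error getting record %s: unexpected HTTP status %d", id, resp.StatusCode)
	}
}

// newFakeAPI starts a local test server: record "1" exists,
// "broken" answers with HTTP 500, everything else with HTTP 404.
func newFakeAPI() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/records/1":
			json.NewEncoder(w).Encode(Record{ID: "1", Name: "www", Value: "192.0.2.1"})
		case "/records/broken":
			w.WriteHeader(http.StatusInternalServerError)
		default:
			w.WriteHeader(http.StatusNotFound)
		}
	}))
}

func main() {
	srv := newFakeAPI()
	defer srv.Close()
	client := &Client{BaseURL: srv.URL, HTTP: srv.Client()}

	for _, id := range []string{"1", "gone", "broken"} {
		record, err := client.GetRecord(id)
		fmt.Printf("id=%s found=%v err=%v\n", id, record != nil, err)
	}
}
```

With a client shaped like this, the read callback only has to check for a nil record. Every other failure reaches Terraform as an error and the state stays untouched.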

If you don’t like returning nil when a resource doesn’t exist, use a dedicated error to signal that it doesn’t exist anymore. Whatever you do, only remove a resource from local state if you are 100% sure it doesn’t exist anymore.
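The error-based alternative could look roughly like the sketch below. Everything in it is hypothetical: ErrNotFound, the state type (a stand-in for *schema.ResourceData so the example runs without the Terraform SDK), and getRecord, which only simulates API responses.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error meaning "the API said HTTP 404".
var ErrNotFound = errors.New("record not found")

// state is a stand-in for *schema.ResourceData, reduced to the ID.
type state struct{ id string }

func (s *state) Id() string      { return s.id }
func (s *state) SetId(id string) { s.id = id }

// getRecord simulates an API client: "1" exists, "broken" fails
// like an HTTP 500 would, and every other ID is reported as not found.
func getRecord(id string) (string, error) {
	switch id {
	case "1":
		return "www", nil
	case "broken":
		return "", errors.New("unexpected HTTP status 500")
	default:
		return "", fmt.Errorf("record %s: %w", id, ErrNotFound)
	}
}

// readRecord removes the resource from state only for ErrNotFound.
// Any other error is returned and the ID is left as it was.
func readRecord(s *state) error {
	_, err := getRecord(s.Id())
	if errors.Is(err, ErrNotFound) {
		s.SetId("")
		return nil
	}
	if err != nil {
		return fmt.Errorf("error getting record with id %s: %w", s.Id(), err)
	}
	// update local state omitted for clarity
	return nil
}

func main() {
	for _, id := range []string{"1", "gone", "broken"} {
		s := &state{id: id}
		err := readRecord(s)
		fmt.Printf("id=%q after read: id=%q err=%v\n", id, s.Id(), err)
	}
}
```

Wrapping the sentinel with %w and checking it with errors.Is keeps the "not found" decision explicit, at the cost of one more exported name in the client package.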

Logging and Debugging Providers

Terraform includes a framework for writing acceptance tests. Unlike unit tests, these tests create and destroy real cloud resources. That’s great because you get immediate feedback on whether your provider is compatible with the current version of the cloud service provider’s API. And what happens if a test fails? Of course, you see a failed test in the test report, but that’s it.

The Terraform testing framework runs a provider in a separate process, so connecting a debugger isn’t easy. There is a GitHub issue where this is discussed. Currently (May 2020), logging is the preferred way to analyze failing tests.

import (
	"io"
	"log"
	"net/http"
)

func doHTTPRequest(method string, url string, body io.Reader) (*http.Response, error) {
	client := &http.Client{}
	log.Printf("[DEBUG] HTTP request to API %s %s", method, url)
	req, err := http.NewRequest(method, url, body)
	if err != nil {
		log.Printf("[DEBUG] Error while creating HTTP request to API %s", err)
		return nil, err
	}

	// omitted for clarity
}

Import log and print messages where needed. When running tests, set TF_LOG=debug to see the log messages in the test output. You can try this on your own: clone terraform-provider-hetznerdns and run TF_LOG=debug make testacc.

The approach isn’t ideal, but I think it is OK as long as there are unit tests in place and the API you are using is stable. Then you don’t have to debug too often.

Other than that, I didn’t want Printfs everywhere in the codebase and decided to use them only where I really needed them to figure out why the provider fails. Logging errors when API calls fail, and logging which API call was made, was really helpful.

How to Use a Custom Terraform Provider

There are Terraform providers which are officially tested and released by HashiCorp and listed on the Terraform website. The Terraform Provider Development Program describes the process and is intended for vendors who want to build Terraform providers for their products that work out of the box.

However, as this Terraform provider is a side project and I don’t work for Hetzner, this isn’t an option. Instead I maintain and release the Terraform provider hetznerdns on my own. This means that Terraform can’t install the provider automatically, as it would with the official AWS provider.

In order to use the provider, download a release from GitHub, extract the provider executable and copy it to ~/.terraform.d/plugins.

$ mkdir -p ~/.terraform.d/plugins
$ tar xzf terraform-provider-hetznerdns_1.0.0_linux_amd64.tar.gz
$ mv ./terraform-provider-hetznerdns ~/.terraform.d/plugins

You can read more about plugin locations in the Terraform docs.

Conclusion

Developing this Terraform provider was a lot of fun. I learnt more about Terraform internals, wrote lots of Go code again, and used GitHub Actions for the first time. With this provider, I can now automate creating the infrastructure and setting up a Jitsi video conferencing server with Terraform only. You can read about this in the next post.

You can find the provider on GitHub. Give it a try and tell me if you like it and how it can be improved.